Computational Optimization and Applications

, Volume 48, Issue 2, pp 273–307 | Cite as

A coordinate gradient descent method for 1-regularized convex minimization

  • Sangwoon YunEmail author
  • Kim-Chuan Toh


In applications such as signal processing and statistics, many problems involve finding sparse solutions to under-determined linear systems of equations. These problems can be formulated as a structured nonsmooth optimization problems, i.e., the problem of minimizing 1-regularized linear least squares problems. In this paper, we propose a block coordinate gradient descent method (abbreviated as CGD) to solve the more general 1-regularized convex minimization problems, i.e., the problem of minimizing an 1-regularized convex smooth function. We establish a Q-linear convergence rate for our method when the coordinate block is chosen by a Gauss-Southwell-type rule to ensure sufficient descent. We propose efficient implementations of the CGD method and report numerical results for solving large-scale 1-regularized linear least squares problems arising in compressed sensing and image deconvolution as well as large-scale 1-regularized logistic regression problems for feature selection in data classification. Comparison with several state-of-the-art algorithms specifically designed for solving large-scale 1-regularized linear least squares or logistic regression problems suggests that an efficiently implemented CGD method may outperform these algorithms despite the fact that the CGD method is not specifically designed just to solve these special classes of problems.


Coordinate gradient descent Q-linear convergence 1-Regularization Compressed sensing Image deconvolution Linear least squares Logistic regression Convex optimization 


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999) zbMATHGoogle Scholar
  2. 2.
    Bradley, P.S., Fayyad, U.M., Mangasarian, O.L.: Mathematical programming for data mining: formulations and challenges. INFORMS J. Comput. 11, 217–238 (1999) CrossRefzbMATHMathSciNetGoogle Scholar
  3. 3.
    Candès, E.J., Tao, T.: Nearly optimal signal recovery from random projections: Universal encoding strategies. IEEE Trans. Inf. Theory 52, 5406–5425 (2006) CrossRefGoogle Scholar
  4. 4.
    Candès, E.J., Romberg, J., Tao, T.: Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52, 489–509 (2006) CrossRefGoogle Scholar
  5. 5.
    Chang, C.-C., Lin, C.-J.: LIBSVM—A library for support vector machines.
  6. 6.
    Chen, S., Donoho, D., Saunders, M.: Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20, 33–61 (1999) CrossRefzbMATHMathSciNetGoogle Scholar
  7. 7.
    Daubechies, I., De Friese, M., De Mol, C.: An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun. Pure Appl. Math. 57, 1413–1457 (2004) CrossRefzbMATHGoogle Scholar
  8. 8.
    Donoho, D.: Compressed sensing. IEEE Trans. Inf. Theory 52, 1289–1306 (2006) CrossRefMathSciNetGoogle Scholar
  9. 9.
    Donoho, D., Tsaig, Y.: Fast solution of 1-norm minimization problems when the solution may be sparse. IEEE Trans. Inf. Theory 54, 4789–4812 (2008) CrossRefMathSciNetGoogle Scholar
  10. 10.
    Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Stat. 32, 407–499 (2004) CrossRefzbMATHMathSciNetGoogle Scholar
  11. 11.
    Figueiredo, M., Nowak, R.: An EM algorithm for wavelet-based image restoration. IEEE Trans. Image Process. 12, 906–916 (2003) CrossRefMathSciNetGoogle Scholar
  12. 12.
    Figueiredo, M., Nowak, R.: A bound optimization approach to wavelet-based image deconvolution. In: IEEE Int. Conf. on Image Processing—ICIP’05, 2005 Google Scholar
  13. 13.
    Figueiredo, M., Nowak, R., Wright, S.J.: Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems. IEEE J. Sel. Top. Signal Process. 1, 586–598 (2007) CrossRefGoogle Scholar
  14. 14.
    Fuchs, J.-J.: On sparse representations in arbitrary redundant bases. IEEE Trans. Inf. Theory 50, 1341–1344 (2004) CrossRefGoogle Scholar
  15. 15.
    Genkin, A., Lewis, D., Madigan, D.: Large-scale Bayesian logistic regression for text categorization. Technometrics 49, 291–304 (2007) CrossRefMathSciNetGoogle Scholar
  16. 16.
    Golub, T.R., Slonim, D.K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J.P., Coller, H., Loh, M.L., Downing, J.R., Caligiuri, M.A., Bloomfield, C.D., Lander, E.S.: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531–537 (1999) CrossRefGoogle Scholar
  17. 17.
    Hale, E.T., Yin, W., Zhang, Y.: A fixed-point continuation method for 1-regularized minimization with applications to compressed sensing. CAAM Technical report TR07-07, Department of Computational and Applied Mathematics, Rice University (July 2007) Google Scholar
  18. 18.
    Kim, S.-J., Koh, K., Lustig, M., Boyd, S., Gorinevsky, D.: An interior-point method for large-scale 1-regularized least squares. IEEE J. Sel. Top. Signal Process. 1, 606–617 (2007) CrossRefGoogle Scholar
  19. 19.
    Koh, K., Kim, S.-J., Boyd, S.: An interior-point method for large-scale 1-regularized logistic regression. J. Mach. Learn. Res. 8, 1519–1555 (2007) MathSciNetGoogle Scholar
  20. 20.
    Lee, S., Lee, H., Abeel, P., Ng, A.: Efficient 1-regularized logistic regression. In: Proceedings of the 21st National Conference on Artificial Intelligence, 2006 Google Scholar
  21. 21.
    Lewis, D.D., Yang, Y., Rose, T.G., Li, F.: RCV1: A new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004) Google Scholar
  22. 22.
    Lin, C.-J., Weng, R.C., Keerthi, S.S.: Trust region Newton method for large-scale logistic regression. J. Mach. Learn. Res. 9, 627–650 (2008) MathSciNetGoogle Scholar
  23. 23.
    Luo, Z.-Q., Tseng, P.: Error bounds and convergence analysis of feasible descent methods: a general approach. Ann. Oper. Res. 46, 157–178 (1993) CrossRefMathSciNetGoogle Scholar
  24. 24.
    Mangasarian, O.L., Musicant, D.R.: Large scale kernel regression via linear programming. Mach. Learn. 46, 255–269 (2002) CrossRefzbMATHGoogle Scholar
  25. 25.
    Ng, A.Y.: Feature selection, 1 vs. 2 regularization, and rotational invariance, In: Proceedings of the 21st International Conference on Machine Learning, 2004 Google Scholar
  26. 26.
    Osborne, M., Presnell, B., Turlach, B.: A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20, 389–403 (2000) CrossRefzbMATHMathSciNetGoogle Scholar
  27. 27.
    Park, M., Hastie, T.: An 1 regularization-path algorithm for generalized linear models. J. R. Stat. Soc. B 69, 659–677 (2007) CrossRefMathSciNetGoogle Scholar
  28. 28.
    Sardy, S., Tseng, P.: AMlet, RAMlet, and GAMlet: automatic nonlinear fitting of additive models, robust and generalized, with wavelets. J. Comput. Graph. Stat. 13, 283–309 (2004) CrossRefMathSciNetGoogle Scholar
  29. 29.
    Shi, W., Wahba, G., Wright, S.J., Lee, K., Klein, R., Klein, B.: Lasso-patternsearch algorithm with application to ophthalmology and genomic data. Stat. Interface 1, 137–153 (2008) MathSciNetGoogle Scholar
  30. 30.
    Starck, J.-L., Nguyen, M., Murtagh, F.: Wavelets and curvelets for image deconvolution: a combined approach. Signal Process. 83, 2279–2283 (2003) CrossRefzbMATHGoogle Scholar
  31. 31.
    Tropp, J.A.: Just relax: Convex programming methods for identifying sparse signals. IEEE Trans. Inf. Theory 51, 1030–1051 (2006) CrossRefMathSciNetGoogle Scholar
  32. 32.
    Tsaig, Y., Donoho, D.: Extensions of compressed sensing. Signal Process. 86, 533–548 (2005) CrossRefGoogle Scholar
  33. 33.
    Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 117, 387–423 (2009) CrossRefzbMATHMathSciNetGoogle Scholar
  34. 34.
    Wang, L.: Efficient regularized solution path algorithms with applications in machine learning and data mining. Ph.D. thesis, University of Michigan (2008) Google Scholar
  35. 35.
    Wright, S.J., Nowak, R., Figueiredo, M.: Sparse reconstruction by separable approximation. IEEE Trans. Signal Process. (2007, to appear) Google Scholar

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. 1.Singapore-MIT AllianceSingaporeSingapore
  2. 2.Department of MathematicsNational University of SingaporeSingaporeSingapore

Personalised recommendations