Mathematical Programming, Volume 140, Issue 1, pp 125–161

Gradient methods for minimizing composite functions

Yu. Nesterov

Full Length Paper, Series B

Abstract

In this paper we analyze several new methods for solving optimization problems with the objective function formed as a sum of two terms: one is smooth and given by a black-box oracle, and another is a simple general convex function with known structure. Despite the absence of good properties of the sum, such problems, both in convex and nonconvex cases, can be solved with efficiency typical for the first part of the objective. For convex problems of the above structure, we consider primal and dual variants of the gradient method (with convergence rate \(O\left({1 \over k}\right)\)), and an accelerated multistep version with convergence rate \(O\left({1 \over k^2}\right)\), where \(k\) is the iteration counter. For nonconvex problems with this structure, we prove convergence to a point from which there is no descent direction. In contrast, we show that for general nonsmooth, nonconvex problems, even resolving the question of whether a descent direction exists from a point is NP-hard. For all methods, we suggest some efficient “line search” procedures and show that the additional computational work necessary for estimating the unknown problem class parameters can only multiply the complexity of each iteration by a small constant factor. We present also the results of preliminary computational experiments, which confirm the superiority of the accelerated scheme.
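As a concrete illustration of the setting described above, the sketch below applies a composite gradient step to the \(l_1\)-regularized least-squares problem \(\min_x {1 \over 2}\Vert Ax-b\Vert^2 + \lambda\Vert x\Vert_1\): the smooth part is handled through its gradient, the simple convex part through its proximal (soft-thresholding) operator, and an optional FISTA-style momentum term yields the accelerated \(O\left({1 \over k^2}\right)\) behavior. This is a minimal sketch, not the paper's exact algorithms; it uses a fixed step \(1/L\) rather than the line-search procedures the paper develops, and all function names are illustrative.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1: componentwise shrinkage toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def composite_gradient(A, b, lam, iters=300, accelerated=True):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1.

    accelerated=False: plain composite gradient method, rate O(1/k).
    accelerated=True:  multistep (FISTA-style) version, rate O(1/k^2).
    """
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    for _ in range(iters):
        grad = A.T @ (A @ y - b)           # gradient of the smooth part at y
        x_new = soft_threshold(y - grad / L, lam / L)  # composite gradient step
        if accelerated:
            t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
            y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
            t = t_new
        else:
            y = x_new
        x = x_new
    return x
```

With a fixed step size the only per-iteration cost beyond the gradient is the componentwise shrinkage, which matches the abstract's point that the composite structure can be exploited at essentially the cost of the smooth part alone.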


Keywords

Local optimization · Convex optimization · Nonsmooth optimization · Complexity theory · Black-box model · Optimal methods · Structural optimization · \(l_1\)-regularization

Mathematics Subject Classification

90C25 · 90C47 · 68Q25



Acknowledgments

The author would like to thank M. Overton, Y. Xia, and anonymous referees for numerous useful suggestions.



Copyright information

© Springer-Verlag Berlin Heidelberg and Mathematical Optimization Society 2012

Authors and Affiliations

Center for Operations Research and Econometrics (CORE), Catholic University of Louvain (UCL), Louvain-la-Neuve, Belgium
