Computational Optimization and Applications, Volume 61, Issue 3, pp 609–634

Path following in the exact penalty method of convex programming



Classical penalty methods solve a sequence of unconstrained problems that put greater and greater stress on meeting the constraints. In the limit as the penalty constant tends to \(\infty \), one recovers the constrained solution. In the exact penalty method, squared penalties are replaced by absolute value penalties, and the solution is recovered for a finite value of the penalty constant. In practice, the kinks in the penalty and the unknown magnitude of the penalty constant prevent wide application of the exact penalty method in nonlinear programming. In this article, we examine a strategy of path following consistent with the exact penalty method. Instead of performing optimization at a single penalty constant, we trace the solution as a continuous function of the penalty constant. Thus, path following starts at the unconstrained solution and follows the solution path as the penalty constant increases. In the process, the solution path hits, slides along, and exits from the various constraints. For quadratic programming, the solution path is piecewise linear and takes large jumps from constraint to constraint. For a general convex program, the solution path is piecewise smooth, and path following operates by numerically solving an ordinary differential equation segment by segment. Our diverse applications to (a) projection onto a convex set, (b) nonnegative least squares, (c) quadratically constrained quadratic programming, (d) geometric programming, and (e) semidefinite programming illustrate the mechanics and potential of path following. The final detour to image denoising demonstrates the relevance of path following to regularized estimation in inverse problems. In regularized estimation, one follows the solution path as the penalty constant decreases from a large value.
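As a toy illustration of the path-following idea (not code from the paper), consider projecting a scalar \(y\) onto the nonnegative reals by minimizing \(\tfrac{1}{2}(x-y)^2 + \rho \max(0, -x)\). The minimizer has a closed form, and tracing it as the penalty constant \(\rho\) grows shows the piecewise-linear path that slides toward and then hits the constraint \(x \ge 0\). The function name and interface below are invented for this sketch:

```python
def exact_penalty_path(y, rhos):
    """Trace the exact-penalty solution path for projecting scalar y onto x >= 0.

    For each penalty constant rho, minimizes 0.5*(x - y)**2 + rho*max(0, -x)
    using the closed-form stationarity conditions of this one-dimensional problem.
    """
    path = []
    for rho in rhos:
        if y >= 0:
            # Constraint never active: the unconstrained solution x = y is feasible.
            x = float(y)
        elif y + rho < 0:
            # Path slides toward the constraint with slope 1 in rho.
            x = float(y + rho)
        else:
            # Path has hit the constraint x = 0 (at rho = -y) and stays there.
            x = 0.0
        path.append(x)
    return path
```

For `y = -2`, the path over `rhos = [0, 1, 2, 3]` is `[-2.0, -1.0, 0.0, 0.0]`: piecewise linear, starting at the unconstrained solution and absorbed into the constraint once \(\rho\) reaches the finite value \(-y = 2\), exactly the finite-penalty recovery property the abstract describes.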


Keywords

Constrained convex optimization · Exact penalty · Geometric programming · Ordinary differential equation · Quadratically constrained quadratic programming · Regularization · Semidefinite programming

Mathematics Subject Classification

65K05 · 90C25



Research supported in part by National Science Foundation Grant DMS-1310319 and National Institutes of Health Grants GM53275, MH59490, HG006139, and GM105785.

Supplementary material

Supplementary material 1 (PNG 181 KB)



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Department of Statistics, North Carolina State University, Raleigh, USA
  2. Departments of Biomathematics, Human Genetics, and Statistics, University of California, Los Angeles, USA
