Journal of Scientific Computing, Volume 68, Issue 2, pp 546–572

Alternating Proximal Gradient Method for Convex Minimization

  • Shiqian Ma


In this paper, we apply the idea of alternating proximal gradient to solve separable convex minimization problems with three or more blocks of variables linked by linear constraints. The proposed method first groups the variables into two blocks and then applies a proximal gradient based inexact alternating direction method of multipliers to the resulting two-block formulation. The main computational effort in each iteration is computing the proximal mappings of the convex functions involved. We establish the global convergence of the proposed method. We show that many interesting problems arising in machine learning, statistics, medical imaging and computer vision can be solved by the proposed method. Numerical results on problems such as latent variable graphical model selection, stable principal component pursuit and compressive principal component pursuit are presented.
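To make the phrase "proximal mappings of the involved convex functions" concrete, the sketch below shows two standard proximal mappings that commonly appear in sparse and low-rank problems: soft-thresholding for the \(\ell_1\) norm and singular value thresholding for the nuclear norm. This is an illustration only. The choice of these two functions, and the names `prox_l1` and `prox_nuclear`, are assumptions made here and are not taken from the paper, and the code is not the paper's algorithm.

```python
import numpy as np


def prox_l1(v, tau):
    """Proximal mapping of tau * ||x||_1 (hypothetical helper name).

    Solves argmin_x tau * ||x||_1 + 0.5 * ||x - v||^2, which has the
    closed form of elementwise soft-thresholding.
    """
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def prox_nuclear(V, tau):
    """Proximal mapping of tau * ||X||_* (hypothetical helper name).

    Solves argmin_X tau * ||X||_* + 0.5 * ||X - V||_F^2 by
    soft-thresholding the singular values of V.
    """
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt


if __name__ == "__main__":
    x = prox_l1(np.array([3.0, -0.5, 1.0]), tau=1.0)
    X = prox_nuclear(np.diag([3.0, 0.5]), tau=1.0)
    print(x)
    print(X)
```

Both mappings have closed forms, which is why methods whose per-iteration cost is dominated by proximal mappings can be practical for this class of problems; the nuclear-norm case costs one singular value decomposition per call.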


Alternating direction method of multipliers; Proximal gradient method; Global convergence; Sparse and low-rank optimization

Mathematics Subject Classification

65K05 90C25 49M27 



The author is grateful to Professor Wotao Yin for reading an earlier version of this paper and for valuable suggestions and comments. The author thanks the associate editor and two anonymous referees for their constructive comments that have helped improve the presentation of this paper greatly.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong
