
Journal of Scientific Computing, Volume 79, Issue 2, pp 671–699

Accelerated Alternating Direction Method of Multipliers: An Optimal O(1/K) Nonergodic Analysis

  • Huan Li
  • Zhouchen Lin
Review Paper

Abstract

The Alternating Direction Method of Multipliers (ADMM) is widely used for linearly constrained convex problems. It is proven to have an \(o(1/\sqrt{K})\) nonergodic convergence rate and a faster O(1/K) rate after ergodic averaging, where K is the number of iterations. Such a nonergodic convergence rate is not optimal. Moreover, ergodic averaging may destroy the sparseness and low-rankness sought in sparse and low-rank learning. In this paper, we modify the accelerated ADMM proposed in Ouyang et al. (SIAM J. Imaging Sci. 8(1):644–681, 2015) and give an O(1/K) nonergodic convergence rate analysis, which satisfies \(|F(\mathbf {x}^K)-F(\mathbf {x}^*)|\le O(1/K)\) and \(\Vert \mathbf {A}\mathbf {x}^K-\mathbf {b}\Vert \le O(1/K)\), where \(F(\mathbf {x})\) is the objective function and \(\mathbf {A}\mathbf {x}=\mathbf {b}\) is the linear constraint; in addition, \(\mathbf {x}^K\) has more favorable sparseness and low-rankness than its ergodic counterpart. To the best of our knowledge, this is the first ADMM-type method with an O(1/K) nonergodic convergence rate for general linearly constrained convex problems. Moreover, we show that the lower complexity bound of ADMM-type methods for separable linearly constrained nonsmooth convex problems is O(1/K), which means that our method is optimal.
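
For context, the classical two-block ADMM that the abstract refers to can be sketched as follows. This is a minimal illustration, assuming the standard splitting of Boyd et al. [1] applied to basis pursuit (\(\min \Vert \mathbf {x}\Vert _1\) s.t. \(\mathbf {A}\mathbf {x}=\mathbf {b}\)); it is not the accelerated variant analyzed in this paper, and the function name admm_basis_pursuit and all parameter choices are illustrative. The running average is tracked only to show how ergodic averaging tends to fill in zeros that the last (nonergodic) iterate keeps.

```python
import numpy as np

def admm_basis_pursuit(A, b, rho=1.0, num_iters=500):
    """Classical two-block ADMM for  min ||x||_1  s.t.  A x = b  (basis pursuit).

    Splitting: f(x) = indicator{A x = b}, g(z) = ||z||_1, constraint x = z.
    Returns the last z iterate and the ergodic (running) average of z.
    """
    m, n = A.shape
    # Cache the matrix used by the projection onto the affine set {x : A x = b}.
    pinv_term = A.T @ np.linalg.inv(A @ A.T)

    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)          # scaled dual variable
    z_avg = np.zeros(n)      # ergodic average of the z iterates

    for k in range(1, num_iters + 1):
        # x-update: Euclidean projection of (z - u) onto {x : A x = b}.
        v = z - u
        x = v - pinv_term @ (A @ v - b)
        # z-update: soft-thresholding, i.e. the prox of ||.||_1 with parameter 1/rho.
        w = x + u
        z = np.sign(w) * np.maximum(np.abs(w) - 1.0 / rho, 0.0)
        # Dual update on the scaled multiplier.
        u = u + x - z
        # Running (ergodic) average, kept only to contrast with the last iterate.
        z_avg += (z - z_avg) / k

    return z, z_avg

# Tiny usage example on a random sparse-recovery instance.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 100))
x_true = np.zeros(100)
x_true[rng.choice(100, 5, replace=False)] = rng.standard_normal(5)
b = A @ x_true
z_last, z_erg = admm_basis_pursuit(A, b)
print("nonzeros (last iterate):", np.sum(np.abs(z_last) > 1e-6))
print("nonzeros (ergodic avg): ", np.sum(np.abs(z_erg) > 1e-6))
```

Because each z-update is a soft-thresholding step, the last iterate is exactly sparse, while the ergodic average accumulates the supports of all iterates and is typically much denser, which is the behavior the abstract points out.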

Keywords

Accelerated Alternating Direction Method of Multipliers · O(1/K) nonergodic convergence rate · O(1/K) lower complexity bound

Notes

Acknowledgements

Zhouchen Lin is supported by National Basic Research Program of China (973 Program) (Grant No. 2015CB352502), National Natural Science Foundation (NSF) of China (Grant Nos. 61625301 and 61731018), Qualcomm and Microsoft Research Asia.

Supplementary material

Supplementary material 1: 10915_2018_893_MOESM1_ESM.pdf (278 KB)

References

  1. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine Learning, pp. 1–122 (2011)
  2. Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40(1), 120–145 (2011)
  3. Esser, E., Zhang, X., Chan, T.: A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science. SIAM J. Imaging Sci. 3(4), 1015–1046 (2010)
  4. He, B., Liao, L., Han, D., Yang, H.: A new inexact alternating directions method for monotone variational inequalities. Math. Program. 92(1), 103–118 (2002)
  5. Shefi, R., Teboulle, M.: Rate of convergence analysis of decomposition methods based on the proximal method of multipliers for convex minimization. SIAM J. Optim. 24(1), 269–297 (2014)
  6. Wang, X., Yuan, X.: The linearized alternating direction method for Dantzig selector. SIAM J. Sci. Comput. 34(5), A2792–A2811 (2012)
  7. He, B., Yuan, X.: On the \({O}(1/t)\) convergence rate of the Douglas–Rachford alternating direction method. SIAM J. Numer. Anal. 50, 700–709 (2012)
  8. He, B., Yuan, X.: On non-ergodic convergence rate of Douglas–Rachford alternating directions method of multipliers. Numer. Math. 130, 567–577 (2015)
  9. Davis, D., Yin, W.: Convergence rate analysis of several splitting schemes. Technical report, UCLA CAM Report (2014)
  10. Douglas, J., Rachford, H.: On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82(2), 421–439 (1956)
  11. Gabay, D.: Applications of the method of multipliers to variational inequalities. Stud. Math. Appl. 15, 299–331 (1983)
  12. Beck, A., Teboulle, M.: A fast iterative shrinkage thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
  13. Nesterov, Yu.: A method for unconstrained convex minimization problem with the rate of convergence \({O}(1/k^2)\). Sov. Math. Dokl. 27(2), 372–376 (1983)
  14. Nesterov, Yu.: On an approach to the construction of optimal methods of minimization of smooth convex functions. Èkon. Mat. Metody 24(3), 509–517 (1988)
  15. Tseng, P.: On accelerated proximal gradient methods for convex-concave optimization. Technical report, University of Washington, Seattle (2008)
  16. Chen, C., Chan, R., Ma, S., Yang, J.: Inertial proximal ADMM for linearly constrained separable convex optimization. SIAM J. Imaging Sci. 8(4), 2239–2267 (2015)
  17. Lorenz, D., Pock, T.: An inertial forward–backward algorithm for monotone inclusions. J. Math. Imaging Vis. 51(2), 311–325 (2015)
  18. Ouyang, Y., Chen, Y., Lan, G., Pasiliao, E.: An accelerated linearized alternating direction method of multipliers. SIAM J. Imaging Sci. 8(1), 644–681 (2015)
  19. Goldstein, T., O’Donoghue, B., Setzer, S., Baraniuk, R.: Fast alternating direction optimization methods. SIAM J. Imaging Sci. 7(3), 1588–1623 (2014)
  20. Deng, W., Yin, W.: On the global and linear convergence of the generalized alternating direction method of multipliers. J. Sci. Comput. 66(3), 889–916 (2016)
  21. Hong, M., Luo, Z.: On the linear convergence of the alternating direction method of multipliers. Math. Program. 162(1–2), 165–199 (2017)
  22. Giselsson, P., Boyd, S.: Linear convergence and metric selection in Douglas–Rachford splitting and ADMM. IEEE Trans. Autom. Control 62(2), 532–544 (2017)
  23. Yang, W., Han, D.: Linear convergence of the alternating direction method of multipliers for a class of convex optimization problems. SIAM J. Numer. Anal. 54(2), 625–640 (2016)
  24. Boley, D.: Local linear convergence of the alternating direction method of multipliers on quadratic or linear programs. SIAM J. Optim. 23(4), 2183–2207 (2013)
  25. Chen, Y., Lan, G., Ouyang, Y.: Optimal primal-dual methods for a class of saddle point problems. SIAM J. Optim. 24(4), 1779–1814 (2014)
  26. Donoho, D.: De-noising by soft-thresholding. IEEE Trans. Inf. Theory 41(3), 613–627 (1995)
  27. Cai, J., Candès, E., Shen, Z.: A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20(4), 1956–1982 (2010)
  28. Lin, Z., Liu, R., Li, H.: Linearized alternating direction method with parallel splitting and adaptive penalty for separable convex programs in machine learning. Mach. Learn. 99(2), 287–325 (2015)
  29. Ma, Y., Lin, Z., Chen, M.: The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices (2010). arXiv:1009.5055
  30. O’Donoghue, B., Candès, E.: Adaptive restart for accelerated gradient schemes. Found. Comput. Math. 15(3), 715–732 (2015)
  31. Necoara, I., Nesterov, Yu., Glineur, F.: Linear convergence of first order methods for non-strongly convex optimization (2016). arXiv:1504.06298
  32. Li, H., Lin, Z.: Provable accelerated gradient method for nonconvex low rank optimization (2017). arXiv:1702.04959
  33. Bauschke, H., Bello Cruz, J., Nghia, T., Phan, H., Wang, X.: The rate of linear convergence of the Douglas–Rachford algorithm for subspaces is the cosine of the Friedrichs angle. J. Approx. Theory 185, 63–79 (2014)
  34. Woodworth, B., Srebro, N.: Tight complexity bounds for optimizing composite objectives. In: Advances in Neural Information Processing Systems (NIPS), pp. 3639–3647 (2016)
  35. Meier, L., van de Geer, S., Bühlmann, P.: The group LASSO for logistic regression. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 70(1), 53–71 (2008)
  36. Jacob, L., Obozinski, G., Vert, J.: Group LASSO with overlap and graph LASSO. In: Proceedings of the 26th Annual International Conference on Machine Learning (ICML), pp. 433–440 (2009)
  37. van de Vijver, M., He, Y., van’t Veer, L., Dai, H., et al.: A gene-expression signature as a predictor of survival in breast cancer. N. Engl. J. Med. 347(25), 1999–2009 (2002)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Key Laboratory of Machine Perception (MOE), School of EECS, Peking University, Beijing, China
