Journal of Optimization Theory and Applications, Volume 179, Issue 2, pp 676–695

Stochastic Accelerated Alternating Direction Method of Multipliers with Importance Sampling

  • Chenxi Chen
  • Yunmei Chen
  • Yuyuan Ouyang
  • Eduardo Pasiliao


In this paper, we incorporate an importance sampling strategy into the accelerated framework of the stochastic alternating direction method of multipliers for solving a class of stochastic composite problems with a linear equality constraint. Rates of convergence for the primal residual and the feasibility violation are established. Moreover, the estimate of the variance of the stochastic gradient is improved through the use of importance sampling. The proposed algorithm can handle the case where the feasible set is unbounded. Experimental results indicate the effectiveness of the proposed method.
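As a rough illustration of why importance sampling can reduce the variance of a stochastic gradient, the sketch below compares uniform sampling with sampling proportional to gradient norms on a synthetic finite sum. It is not the algorithm proposed in the paper: the finite-sum setting, the norm-proportional probabilities, and the names `sample_gradient` and `estimator_variance` are all assumptions made for this example.

```python
import numpy as np

def sample_gradient(grads, probs, rng):
    """Draw one index i with probability probs[i] and return the
    reweighted gradient grads[i] / (n * probs[i]), which is an
    unbiased estimate of the average gradient."""
    n = grads.shape[0]
    i = rng.choice(n, p=probs)
    return grads[i] / (n * probs[i])

def estimator_variance(grads, probs):
    """Exact total variance E||g - mean||^2 of the one-sample estimator."""
    n = grads.shape[0]
    full = grads.mean(axis=0)
    sq_norms = np.sum(grads ** 2, axis=1)
    # E||g||^2 = sum_i p_i * ||grad_i||^2 / (n * p_i)^2
    second_moment = np.sum(sq_norms / (n ** 2 * probs))
    return second_moment - full @ full

# Synthetic per-sample gradients with very different magnitudes (assumed data).
rng = np.random.default_rng(0)
n, d = 200, 5
scales = rng.lognormal(mean=0.0, sigma=1.5, size=n)
grads = rng.normal(size=(n, d)) * scales[:, None]

uniform = np.full(n, 1.0 / n)
norms = np.linalg.norm(grads, axis=1)
by_norm = norms / norms.sum()  # hypothetical choice: p_i proportional to ||grad_i||

var_uniform = estimator_variance(grads, uniform)
var_importance = estimator_variance(grads, by_norm)
print("uniform sampling variance:   ", var_uniform)
print("importance sampling variance:", var_importance)
```

With norm-proportional probabilities, the second moment of the estimator equals the squared mean of the gradient norms, which by the Cauchy-Schwarz inequality never exceeds the mean of the squared norms obtained under uniform sampling. The gap grows as the per-sample gradient magnitudes become more uneven.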


Stochastic ADMM · Duality gap · Variance estimation · Importance sampling

Mathematics Subject Classification

90C06 90C25 90C30 



Funding was provided by the National Science Foundation (Grant No. DMS 1719932).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Mathematics, University of Florida, Gainesville, USA
  2. Department of Mathematical Sciences, Clemson University, Clemson, USA
  3. Munitions Directorate, Air Force Research Laboratory, Eglin AFB, USA
