Block-Coordinate Primal-Dual Method for Nonsmooth Minimization over Linear Constraints

Part of the Lecture Notes in Mathematics book series (LNM, volume 2227)


We consider the problem of minimizing a convex, separable, nonsmooth function subject to linear constraints. The numerical method we propose is a block-coordinate extension of the Chambolle-Pock primal-dual algorithm. We prove convergence of the method without assuming smoothness or strong convexity of the objective, a full-rank condition on the matrix, strong duality, or even consistency of the linear system. Dropping the consistency assumption yields convergence guarantees for misspecified or noisy systems.
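
For orientation, the following is a minimal sketch of the classical (full-update) Chambolle-Pock iteration applied to a linearly constrained problem. It is not the block-coordinate method of the chapter. The choice of objective (the l1 norm, whose proximal operator is soft-thresholding), the function names, the toy data, and the step sizes are all assumptions made for illustration; the step sizes are chosen so that tau * sigma * ||A||^2 < 1.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| evaluated at the scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def pdhg_l1_linear(A, b, tau, sigma, iters=2000):
    """Classical Chambolle-Pock iteration for: min ||x||_1  s.t.  A x = b.

    Illustrative helper (hypothetical name). A is a list of rows, b a list.
    Assumes tau * sigma * ||A||^2 < 1.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n  # primal variable
    y = [0.0] * m  # dual variable (multiplier for A x = b)
    for _ in range(iters):
        # Primal step: proximal step on the objective along -A^T y.
        ATy = [sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]
        x_new = [soft_threshold(x[j] - tau * ATy[j], tau) for j in range(n)]
        # Extrapolation of the primal iterate.
        x_bar = [2.0 * x_new[j] - x[j] for j in range(n)]
        # Dual step: ascent on the constraint residual at the extrapolated point.
        residual = [sum(A[i][j] * x_bar[j] for j in range(n)) - b[i]
                    for i in range(m)]
        y = [y[i] + sigma * residual[i] for i in range(m)]
        x = x_new
    return x, y


# Toy instance (assumed data): minimize |x1| + |x2| subject to x1 + 2*x2 = 2.
# Here ||A||^2 = 5, so tau = sigma = 0.4 gives tau * sigma * ||A||^2 = 0.8 < 1.
A = [[1.0, 2.0]]
b = [2.0]
x, y = pdhg_l1_linear(A, b, tau=0.4, sigma=0.4)
print(x, y)
```

The unique minimizer of this toy problem is x = (0, 1), so the printed primal iterate should be close to that point. A block-coordinate variant of the kind the abstract describes would update only a randomly chosen block of x per iteration; the details of that update are not given on this page and are not reproduced here.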


Keywords

Saddle-point problems, First order algorithms, Primal-dual algorithms, Coordinate methods, Randomized methods

AMS Subject Classifications

49M29, 65K10, 65Y20, 90C25



This research was supported by the German Research Foundation grant SFB755-A4.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Institute for Numerical and Applied Mathematics, University of Göttingen, Göttingen, Germany
