
Asynchronous level bundle methods

  • Full Length Paper
  • Series A
  • Published in Mathematical Programming

Abstract

In this paper, we consider nonsmooth convex optimization problems with an additive structure featuring independent oracles (black boxes) working in parallel. Existing methods for solving these distributed problems in a general form are synchronous, in the sense that they wait for the responses of all the oracles before performing a new iteration. We propose level bundle methods handling asynchronous oracles. These methods require original upper bounds (obtained via upper-models or scarce coordinations) to deal with asynchronicity. We prove their convergence using variational-analysis techniques and illustrate their practical performance on a Lagrangian decomposition problem.
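To make the setting concrete: the paper considers problems of the form \(\min_{x\in X} f(x) = \sum_{i=1}^m f^i(x)\), where each component \(f^i\) is known only through an oracle returning \((f^i(x), g^i)\) with \(g^i\) a subgradient. The following sketch is purely illustrative and is not the authors' level bundle method: the toy components, the simulated delays, and the plain subgradient-style update are all assumptions made for the example. It shows only the asynchronous mechanism at the heart of the paper: the master iterates as soon as any single oracle responds, re-querying only that oracle, so subgradients may have been computed at outdated points.

```python
# Illustrative sketch of the asynchronous oracle pattern (not the authors'
# algorithm): the master updates as soon as ANY of the m oracles answers.
# The toy components f^i, the random delays and the subgradient-style update
# are assumptions made only for this example.
import random
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

import numpy as np

def oracle(i, x):
    """Exact oracle for the toy component f^i(x) = ||x - c_i||^2 / 2."""
    time.sleep(random.uniform(0.01, 0.2))   # heterogeneous response times
    c_i = np.full_like(x, float(i))         # toy data defining component i
    return i, 0.5 * np.dot(x - c_i, x - c_i), x - c_i  # (i, f^i(x), g^i)

m, x = 4, np.zeros(3)
with ThreadPoolExecutor(max_workers=m) as pool:
    pending = {pool.submit(oracle, i, x) for i in range(m)}
    for _ in range(25):                     # a few master iterations
        done, pending = wait(pending, return_when=FIRST_COMPLETED)
        for fut in done:
            i, f_i, g_i = fut.result()      # g_i was computed at an old point
            x = x - 0.05 * g_i              # placeholder step; the paper's
                                            # level bundle update goes here
            pending.add(pool.submit(oracle, i, x))  # re-query only oracle i
print("final point:", x)
```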


Notes

  1. To better underline our contributions on asynchronicity, we first consider only exact oracles of the \(f^i\) as above. Later, in Sect. 5, we explain how our developments extend easily to the case of inexact oracles providing noisy approximations of \((f^i({x}), g^i)\).

  2. Note that Algorithm 2 still needs initial bounds \(f^{{{\,\mathrm{up}\,}}}_1\) and \(f^{{{\,\mathrm{low}\,}}}_1\). These bounds can often be estimated easily from the data of the problem. Otherwise, we can use the standard initialization: call the m oracles at an initial point \(x_1\) and wait for their first responses, from which we can compute \(f^{{{\,\mathrm{up}\,}}}_1 = f(x_1) = \sum _i f^i(x_1)\) and \(f^{{{\,\mathrm{low}\,}}}_1\) as the minimum of the linearization \(f(x_1) + {\langle g_1,x-x_1 \rangle }\) over the compact set X (see the sketch after these notes). If we do not want this synchronous initial step, we may alternatively estimate \(f^{{{\,\mathrm{lev}\,}}}\) and set \(f^{{{\,\mathrm{up}\,}}}_1=+\infty \) and \(f^{{{\,\mathrm{low}\,}}}_1=-\infty \). This would require small changes in the algorithm (in line 15) and in its proof (in Lemma 3). For the sake of clarity, we stick with the simplest version of the algorithm and the most frequent situation, where \(f^{{{\,\mathrm{up}\,}}}_1\) and \(f^{{{\,\mathrm{low}\,}}}_1\) can be estimated easily.

  3. As the oracles are assumed to respond in finite time, the inequality \(\min _{j=1,\ldots ,m} {{\mathtt {a}}(j)} \ge {\bar{k}}\) is guaranteed to be satisfied for k large enough.

  4. By substituting \(f^i_{x}= f^i({x}) - \eta _{x}^{v,i}\) in the inequality \(f^i(\cdot ) \ge f^i_{x}+ \langle g^i_{x}, \cdot -{x} \rangle -\eta _{x}^{s,i}\) and evaluating at x, we get that \(f^i(x)\ge f^i(x) - \eta _x^{v,i} - \eta _x^{s,i}\). This shows that \(\eta ^{v,i}_x+\eta ^{s,i}_x\ge 0\) and, in fact, that \(g_x^i \in \partial _{(\eta ^{v,i}_x+\eta ^{s,i}_x)} f^i(x)\) (written out after these notes).

  5. As in Sect. 3, \({{\mathtt {a}}(i)}\) is the iteration index of the latest information previously provided by oracle i; see Algorithm 2.
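As announced in Note 2, here is a minimal sketch of the synchronous initialization, assuming X is a box \([\ell, u]\) (so the linearization is minimized coordinatewise at the endpoints) and assuming an illustrative oracle signature oracle(i, x) returning \((f^i(x), g^i)\); both the box and the signature are assumptions of the example, not part of Algorithm 2.

```python
# Minimal sketch (under assumptions) of the synchronous initialization in
# Note 2: one round of oracle calls at x1 gives f_up_1 = f(x1); minimizing
# the linearization f(x1) + <g1, x - x1> over an assumed box X = [lo, hi]
# gives f_low_1. The signature oracle(i, x) -> (f^i(x), g^i) is hypothetical.
import numpy as np

def initial_bounds(oracle, m, x1, lo, hi):
    values, grads = zip(*(oracle(i, x1) for i in range(m)))
    f_up = sum(values)                 # f_up_1 = f(x1) = sum_i f^i(x1)
    g1 = np.sum(grads, axis=0)         # aggregate subgradient at x1
    # Minimum of a linear function over a box: each coordinate j picks the
    # endpoint lo_j or hi_j that minimizes g1_j * (x_j - x1_j).
    f_low = f_up + np.sum(np.minimum(g1 * (lo - x1), g1 * (hi - x1)))
    return f_up, f_low
```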
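The computation of Note 4, written out in full (same notation as the note; the inequality below holds for every point y):

```latex
% Substituting f^i_x = f^i(x) - \eta^{v,i}_x into the linearization
% inequality of Note 4, valid for every point y:
\begin{align*}
  f^i(y) \;\ge\; f^i_x + \langle g^i_x,\, y - x \rangle - \eta^{s,i}_x
         \;=\; f^i(x) + \langle g^i_x,\, y - x \rangle
               - \bigl(\eta^{v,i}_x + \eta^{s,i}_x\bigr).
\end{align*}
% Taking y = x gives \eta^{v,i}_x + \eta^{s,i}_x \ge 0; since the display
% holds for all y, it is exactly the definition of
% g^i_x \in \partial_{(\eta^{v,i}_x + \eta^{s,i}_x)} f^i(x).
```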


Acknowledgements

We are grateful to the two referees for their rich feedback on the initial version of our paper. We would like to acknowledge the partial financial support of PGMO (Gaspard Monge Program for Optimization and Operations Research) of the Hadamard Mathematics Foundation, through the project “Advanced nonsmooth optimization methods for stochastic programming”.

Author information


Correspondence to Welington de Oliveira.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Iutzeler, F., Malick, J. & de Oliveira, W. Asynchronous level bundle methods. Math. Program. 184, 319–348 (2020). https://doi.org/10.1007/s10107-019-01414-y

