Abstract
We consider nonsmooth convex optimization problems with additive structure featuring independent oracles (black-boxes) working in parallel. Existing methods for solving these distributed problems in a general form are synchronous, in the sense that they wait for the responses of all the oracles before performing a new iteration. We propose level bundle methods that handle asynchronous oracles. These methods require new upper bounds (using upper models or scarce coordinations) to deal with asynchronicity. We prove their convergence using variational-analysis techniques and illustrate their practical performance on a Lagrangian decomposition problem.
Notes
To better underline our contributions on asynchronicity, we consider first only exact oracles of the \(f^i\) as above. Later in Sect. 5, we explain how our developments easily extend to the case of inexact oracles providing noisy approximations of \((f^i({x}), g^i)\).
Note that Algorithm 2 still needs initial bounds \(f^{{{\,\mathrm{up}\,}}}_1\) and \(f^{{{\,\mathrm{low}\,}}}_1\). These bounds can often be easily estimated from the data of the problem. Otherwise, we can use the standard initialization: call the m oracles at an initial point \(x_1\) and wait for their first responses, from which we can compute \(f^{{{\,\mathrm{up}\,}}}_1 = f(x_1) = \sum _i f^i(x_1)\) and \(f^{{{\,\mathrm{low}\,}}}_1\) as the minimum of the linearization \(f(x_1) + {\langle g_1,x-x_1 \rangle }\) over the compact set X. If we do not want this synchronous initial step, we may alternatively estimate \(f^{{{\,\mathrm{lev}\,}}}\) and set \(f^{{{\,\mathrm{up}\,}}}_1=+\infty \) and \(f^{{{\,\mathrm{low}\,}}}_1=-\infty \). This would require small changes in the algorithm (in line 15) and in its proof (in Lemma 3). For the sake of clarity, we stick with the simplest version of the algorithm and the most frequent situation, where we can easily estimate \(f^{{{\,\mathrm{up}\,}}}_1\) and \(f^{{{\,\mathrm{low}\,}}}_1\).
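The standard initialization described in this note can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the compact set X is taken to be a box (so that the linearization can be minimized coordinate-wise), and the names `initial_bounds`, `oracle_abs` and `oracle_quad` are hypothetical, not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical oracle type: maps a point x to (f^i(x), a subgradient of f^i at x).
Oracle = Callable[[Sequence[float]], Tuple[float, List[float]]]


def initial_bounds(
    oracles: Sequence[Oracle],
    x1: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
) -> Tuple[float, float]:
    """One synchronous round at x1, assuming X is the box [lower, upper]."""
    n = len(x1)
    f_up = 0.0            # f(x1) = sum_i f^i(x1)
    g = [0.0] * n         # aggregated subgradient g_1 = sum_i g^i
    for oracle in oracles:
        value, subgrad = oracle(x1)
        f_up += value
        for j in range(n):
            g[j] += subgrad[j]
    # Minimize f(x1) + <g, x - x1> over the box: each coordinate goes to the
    # bound that makes its term g_j * (x_j - x1_j) smallest.
    f_low = f_up
    for j in range(n):
        xj = lower[j] if g[j] > 0 else upper[j]
        f_low += g[j] * (xj - x1[j])
    return f_up, f_low


# Toy instance (an assumption for illustration): f(x) = |x_0| + (x_1 - 1)^2.
def oracle_abs(x: Sequence[float]) -> Tuple[float, List[float]]:
    sign = float((x[0] > 0) - (x[0] < 0))
    return abs(x[0]), [sign, 0.0]


def oracle_quad(x: Sequence[float]) -> Tuple[float, List[float]]:
    return (x[1] - 1.0) ** 2, [0.0, 2.0 * (x[1] - 1.0)]


f_up, f_low = initial_bounds(
    [oracle_abs, oracle_quad], x1=[1.0, 0.0], lower=[-2.0, -2.0], upper=[2.0, 2.0]
)
print(f_up, f_low)
```

On this toy instance the optimal value is 0, attained at (0, 1), and the computed pair brackets it. For a general compact convex X, the coordinate-wise step would be replaced by a linear minimization over X.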
As the oracles are assumed to respond in finite time, the inequality \(\min _{j=1,\ldots ,m} {{\mathtt {a}}(j)} \ge {\bar{k}}\) is guaranteed to be satisfied for k large enough.
By substituting \(f^i_{x}= f^i({x}) - \eta _{x}^{v,i}\) in the inequality \(f^i(\cdot ) \ge f^i_{x}+ \langle g^i_{x}, \cdot -{x} \rangle -\eta _{x}^{s,i}\) and evaluating at x, we get that \(f^i(x)\ge f^i(x) - \eta _x^{v,i} - \eta _x^{s,i}\). This shows that \(\eta ^{v,i}_x+\eta ^{s,i}_x\ge 0\) and in fact \(g_x^i \in \partial _{(\eta ^{v,i}_x+\eta ^{s,i}_x)} f^i(x)\).
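Written out step by step, the substitution in this note reads as follows. This is a worked version of the same argument, using only the symbols defined there; the dummy variable \(y\) is introduced here for readability.

```latex
\begin{align*}
f^i(y) &\ge f^i_x + \langle g^i_x, y - x \rangle - \eta^{s,i}_x
        && \text{for all } y \\
       &= f^i(x) + \langle g^i_x, y - x \rangle
          - \bigl(\eta^{v,i}_x + \eta^{s,i}_x\bigr)
        && \text{since } f^i_x = f^i(x) - \eta^{v,i}_x .
\end{align*}
% Taking y = x makes the inner product vanish:
\[
f^i(x) \ge f^i(x) - \bigl(\eta^{v,i}_x + \eta^{s,i}_x\bigr)
\quad\Longrightarrow\quad
\eta^{v,i}_x + \eta^{s,i}_x \ge 0 .
\]
```

The second line of the first display is exactly the defining inequality of an \(\varepsilon\)-subgradient of \(f^i\) at \(x\) with \(\varepsilon = \eta^{v,i}_x + \eta^{s,i}_x\), which is nonnegative by the second display.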
As in Sect. 3, \({{\mathtt {a}}(i)}\) is the iteration index of the anterior information provided by oracle i; see Algorithm 2.
Acknowledgements
We are grateful to the two referees for their rich feedback on the initial version of our paper. We would like to acknowledge the partial financial support of PGMO (Gaspard Monge Program for Optimization and operations research) of the Hadamard Mathematics Foundation, through the project “Advanced nonsmooth optimization methods for stochastic programming”.
Cite this article
Iutzeler, F., Malick, J. & de Oliveira, W. Asynchronous level bundle methods. Math. Program. 184, 319–348 (2020). https://doi.org/10.1007/s10107-019-01414-y
DOI: https://doi.org/10.1007/s10107-019-01414-y