Abstract
The distributed optimization problem has become increasingly relevant in recent years. It offers clear advantages, such as processing large amounts of data in less time than non-distributed methods. However, most distributed approaches suffer from a significant bottleneck: the cost of communication. A large body of recent research has therefore been directed at mitigating this problem. One such approach exploits local data similarity; in particular, there exists an algorithm that provably exploits the similarity property in an optimal way. However, this result, like results in other works, addresses the communication bottleneck only through the assumption that communication is significantly more expensive than local computation; it does not take into account the varying capacities of network devices or the different possible ratios between communication time and local computation cost. We consider this setup, and the objective of this study is to find the optimal split of data between the server and the local machines for arbitrary costs of communication and local computation. We compare the running time of the network under the uniform and the optimal data distributions. The superior theoretical performance of our solution is validated experimentally.
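As a rough illustration of the general idea only (not the algorithm analyzed in the paper), the toy Python sketch below chooses the fraction p of the dataset kept on the server so that one round of parallel computation followed by a communication step is as short as possible; all names and cost parameters (t_server, t_worker, t_comm) and the brute-force search over p are assumptions introduced purely for illustration.

    # Toy model: server holds a fraction p of n_samples, the rest is split evenly
    # among n_workers; server and workers compute in parallel, then one
    # communication of fixed (assumed) cost t_comm happens.
    def round_time(p, n_samples, n_workers, t_server, t_worker, t_comm):
        server_part = p * n_samples * t_server                      # server compute time
        worker_part = (1 - p) * n_samples / n_workers * t_worker    # slowest worker time
        return max(server_part, worker_part) + t_comm

    def best_split(n_samples, n_workers, t_server, t_worker, t_comm, grid=1000):
        # One-dimensional brute-force search over the split fraction p in [0, 1].
        candidates = [i / grid for i in range(grid + 1)]
        return min(candidates,
                   key=lambda p: round_time(p, n_samples, n_workers,
                                            t_server, t_worker, t_comm))

    if __name__ == "__main__":
        # Hypothetical costs: the server is twice as fast per sample as a worker.
        n, m = 1_000_000, 10
        uniform = 1 / (m + 1)   # uniform split over server + m workers
        optimal = best_split(n, m, t_server=1e-6, t_worker=2e-6, t_comm=0.05)
        print("uniform round time:", round_time(uniform, n, m, 1e-6, 2e-6, 0.05))
        print("optimal round time:", round_time(optimal, n, m, 1e-6, 2e-6, 0.05))

In this toy model the optimal fraction simply equalizes the server and worker compute times for the given (assumed) per-sample costs, which already shortens a round relative to the uniform split; the paper studies this trade-off rigorously for arbitrary communication and computation costs.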
Funding
The research of A. Beznosikov was supported by the Russian Science Foundation (project no. 23-11-00229).
Ethics declarations
The authors of this work declare that they have no conflicts of interest.
Cite this article
Medyakov, D., Molodtsov, G., Beznosikov, A. et al. Optimal Data Splitting in Distributed Optimization for Machine Learning. Dokl. Math. 108 (Suppl 2), S465–S475 (2023). https://doi.org/10.1134/S1064562423701600