Abstract
Recent advances in Artificial Intelligence (AI) have accelerated its adoption at an unprecedented pace. Large Language Models (LLMs) with tens of billions of parameters demonstrate the crucial importance of parallelizing models. Several techniques exist for distributing Deep Neural Networks, but they are challenging to implement, and the cost of training on GPU-based architectures is becoming prohibitive. In this paper we present a distributed approach that is easier to implement, in which both data and model are partitioned across processing units hosted on a cluster of CPU- or GPU-based machines. Communication between units is done by message passing, and the model is distributed over the cluster and stored locally or in a data lake. We prototyped this approach using open-source libraries, and we present the benefits this implementation can bring.
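The abstract only outlines the architecture, so the following is a minimal, hypothetical sketch rather than the authors' F2D2N implementation: it splits a small multilayer network across operating-system processes, each holding one layer's weights locally and exchanging activations purely by message passing. The layer shapes, the use of Python's multiprocessing pipes as the messaging layer, and the forward-only ReLU computation are all assumptions made for illustration.

    # Hedged sketch, not the F2D2N code: one process per layer, message passing between them.
    import numpy as np
    from multiprocessing import Pipe, Process

    def layer_worker(weight_shape, recv_conn, send_conn):
        # Each process holds only its own layer's weights (its local shard of the model).
        rng = np.random.default_rng(0)
        w = rng.standard_normal(weight_shape) * 0.01
        while True:
            x = recv_conn.recv()          # receive activations from the previous unit
            if x is None:                 # shutdown signal propagated along the chain
                send_conn.send(None)
                break
            send_conn.send(np.maximum(x @ w, 0.0))  # apply this ReLU layer, pass result on

    if __name__ == "__main__":
        sizes = [(784, 256), (256, 64), (64, 10)]        # hypothetical layer shapes
        pipes = [Pipe() for _ in range(len(sizes) + 1)]  # driver -> L0 -> L1 -> L2 -> driver
        workers = [Process(target=layer_worker,
                           args=(sizes[i], pipes[i][1], pipes[i + 1][0]))
                   for i in range(len(sizes))]
        for p in workers:
            p.start()
        batch = np.random.rand(32, 784)   # one mini-batch of input data
        pipes[0][0].send(batch)           # inject the batch at the head of the chain
        print(pipes[-1][1].recv().shape)  # collect the output of the last unit: (32, 10)
        pipes[0][0].send(None)            # shut the chain down
        for p in workers:
            p.join()

The combined data-and-model distribution described in the abstract could be approximated by running several such chains on different data shards and aggregating their updates; the sketch above shows only the model-partitioning and message-passing aspect.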
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Leite, E., Mourlin, F., Paradinas, P. (2024). Fully Distributed Deep Neural Network: F2D2N. In: Bouzefrane, S., Banerjee, S., Mourlin, F., Boumerdassi, S., Renault, É. (eds) Mobile, Secure, and Programmable Networking. MSPN 2023. Lecture Notes in Computer Science, vol 14482. Springer, Cham. https://doi.org/10.1007/978-3-031-52426-4_15
DOI: https://doi.org/10.1007/978-3-031-52426-4_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-52425-7
Online ISBN: 978-3-031-52426-4