Abstract
Training deep neural networks on large amounts of data is resource-intensive, and such problems often cannot be solved on a single computing device in reasonable time. Distributed computing systems can be applied to deep learning problems; such systems may consist of heterogeneous computing nodes with different computing power. To implement deep learning on a distributed heterogeneous system, all available resources must be utilized, which requires configuring the task delivery subsystem of the distributed system. To expand the number of participating computing nodes, virtualization must be used. This article examines two types of virtualization for grid systems applied to deep learning problems, describes the implementation of computational applications that train deep neural networks for image classification, presents the results of distributed deep learning on a public grid system, and gives a comparative analysis of the two virtualization approaches.
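The coordination pattern the abstract refers to — a server distributing training tasks to heterogeneous worker nodes and combining their results each round — can be illustrated with a minimal sketch of synchronous data-parallel gradient averaging. This is a generic illustration, not the paper's implementation: the linear model, the function names (`local_gradient`, `train_step`), and the use of NumPy are all assumptions for the sake of a self-contained example.

```python
import numpy as np

def local_gradient(w, X, y):
    """Mean-squared-error gradient computed on one worker's data shard."""
    residual = X @ w - y
    return 2.0 * X.T @ residual / len(y)

def train_step(w, shards, lr=0.1):
    """One synchronous round: every simulated node reports its gradient,
    the coordinator averages them and applies a single update."""
    grads = [local_gradient(w, X, y) for X, y in shards]
    return w - lr * np.mean(grads, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
X = rng.normal(size=(200, 2))
y = X @ true_w

# Split the dataset across 4 simulated heterogeneous nodes.
shards = list(zip(np.array_split(X, 4), np.array_split(y, 4)))

w = np.zeros(2)
for _ in range(200):
    w = train_step(w, shards)
```

In a real desktop grid the "shards" live on volunteer machines and results arrive over the network with varying latency, which is why task delivery and scheduling (the subject of the paper) matter far more than in this idealized loop.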
Acknowledgements
This work was funded by the Russian Science Foundation (grant № 22-11-00317).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Kurochkin, I., Papanov, V. (2023). Using Virtualization Approaches to Solve Deep Learning Problems in Voluntary Distributed Computing Projects. In: Voevodin, V., Sobolev, S., Yakobovskiy, M., Shagaliev, R. (eds) Supercomputing. RuSCDays 2023. Lecture Notes in Computer Science, vol 14389. Springer, Cham. https://doi.org/10.1007/978-3-031-49435-2_6
DOI: https://doi.org/10.1007/978-3-031-49435-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-49434-5
Online ISBN: 978-3-031-49435-2
eBook Packages: Computer Science, Computer Science (R0)