Abstract
Algorithms are proposed to improve the efficiency of parallel program execution on high-performance computer systems, in particular when solving problems of modeling physical processes. The developed algorithms focus on optimizing the performance of MPI collective operations on multiprocessor SMP/NUMA nodes. The proposed read-write locking algorithms synchronize access to shared memory more efficiently than the algorithms used in the Open PMIx library.
Funding
The work was carried out within the framework of the state assignment of the Rzhanov Institute of Semiconductor Physics, Siberian Branch, Russian Academy of Sciences (no. 0242-2021-0011), and with the support of the Russian Foundation for Basic Research (grant no. 20-07-00039).
Ethics declarations
The authors declare that they have no conflicts of interest.
Additional information
Translated by T. N. Sokolova
Cite this article
Pavsky, K.V., Kurnosov, M.G., Efimov, A.V. et al. Algorithms for Optimizing the Execution of Parallel Programs on High-Performance Systems When Solving Problems of Modeling Physical Processes. Optoelectron. Instrum. Proc. 57, 552–560 (2021). https://doi.org/10.3103/S8756699021050113