The New UPC++ DepSpawn High Performance Library for Data-Flow Computing with Hybrid Parallelism

Fraguela, Basilio B.; Andrade, Diego

doi:10.1007/978-3-031-08751-6_55

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13350))

Included in the following conference series:

International Conference on Computational Science

Abstract

Data-flow computing is a natural and convenient paradigm for expressing parallelism. This is particularly true for tools that automatically extract the data dependencies among tasks while allowing both distributed and shared memory parallelism to be exploited. One such tool is UPC++ DepSpawn, a new task-based library built on UPC++ (Unified Parallel C++), a library for parallel computing in a Partitioned Global Address Space (PGAS) environment, and on the well-known Intel TBB (Threading Building Blocks) library for multithreading. In this paper we present and evaluate the evolution of this library after changing its engine for shared memory parallelism and adapting it to the newest version of UPC++, which differs substantially from the original version on which UPC++ DepSpawn was developed. While keeping the same high level of programmability, the new version is on average 9.3% faster than the old one, with a maximum speedup of 66.3%.
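To make the idea of automatically extracted data dependencies concrete, the following is a minimal conceptual sketch in Python. It is not the UPC++ DepSpawn API: the `Task` class, the `infer_dependencies` function, and the task and variable names are all hypothetical and serve only as an illustration. It assumes each task declares the data it reads and writes, and that tasks are listed in sequential program order; a later task then waits for every earlier task with which it has a read-after-write, write-after-write, or write-after-read conflict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task description: a name plus the data it reads and writes."""
    name: str
    reads: frozenset
    writes: frozenset


def infer_dependencies(tasks):
    """Map each task name to the set of earlier tasks it conflicts with.

    Tasks are assumed to be listed in sequential program order.
    """
    deps = {}
    for i, later in enumerate(tasks):
        waits_for = set()
        for earlier in tasks[:i]:
            raw = earlier.writes & later.reads    # read-after-write
            waw = earlier.writes & later.writes   # write-after-write
            war = earlier.reads & later.writes    # write-after-read
            if raw or waw or war:
                waits_for.add(earlier.name)
        deps[later.name] = waits_for
    return deps


# Illustrative program: two independent initializations, then two consumers.
tasks = [
    Task("init_a", frozenset(), frozenset({"a"})),
    Task("init_b", frozenset(), frozenset({"b"})),
    Task("add", frozenset({"a", "b"}), frozenset({"c"})),
    Task("scale_a", frozenset({"a"}), frozenset({"a"})),
]
deps = infer_dependencies(tasks)
```

In this sketch `init_a` and `init_b` have no dependencies and could run in parallel, `add` waits for both, and `scale_a` waits for `init_a` and `add` because it overwrites data that `add` reads. A real runtime would derive the read and write sets from the task arguments rather than from explicit declarations, and would schedule tasks as their dependencies are satisfied.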

Acknowledgements

This research was supported by the Ministry of Science and Innovation of Spain (PID2019-104184RB-I00/AEI/10.13039/501100011033), and by the Xunta de Galicia, co-funded by the European Regional Development Fund (ERDF), under the Consolidation Programme of Competitive Reference Groups (ED431C 2021/30). We also acknowledge the support from the Centro Singular de Investigación de Galicia “CITIC”, funded by Xunta de Galicia and the European Union (European Regional Development Fund, Galicia 2014–2020 Program), by grant ED431G 2019/01. Finally, we acknowledge the Centro de Supercomputación de Galicia (CESGA) for the use of their computers.

Author information

Authors and Affiliations

Universidade da Coruña, CITIC, Grupo de Arquitectura de Computadores. Facultade de Informática, Campus de Elviña, S/N. 15071, A Coruña, Spain
Basilio B. Fraguela & Diego Andrade

Corresponding author

Correspondence to Basilio B. Fraguela.

Editor information

Editors and Affiliations

Brunel University London, London, UK
Derek Groen
University of Amsterdam, Amsterdam, The Netherlands
Clélia de Mulatier
AGH University of Science and Technology, Krakow, Poland
Maciej Paszynski
University of Amsterdam, Amsterdam, The Netherlands
Valeria V. Krzhizhanovskaya
University of Tennessee at Knoxville, Knoxville, TN, USA
Jack J. Dongarra
University of Amsterdam, Amsterdam, The Netherlands
Peter M. A. Sloot

About this paper

Cite this paper

Fraguela, B.B., Andrade, D. (2022). The New UPC++ DepSpawn High Performance Library for Data-Flow Computing with Hybrid Parallelism. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2022. ICCS 2022. Lecture Notes in Computer Science, vol 13350. Springer, Cham. https://doi.org/10.1007/978-3-031-08751-6_55

DOI: https://doi.org/10.1007/978-3-031-08751-6_55
Published: 15 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08750-9
Online ISBN: 978-3-031-08751-6
eBook Packages: Computer Science, Computer Science (R0)
