Abstract
Data-flow computing is a natural and convenient paradigm for expressing parallelism. This is particularly true for tools that automatically extract the data dependencies among tasks while allowing both distributed and shared memory parallelism to be exploited. This is the case of UPC++ DepSpawn, a new task-based library built on UPC++ (Unified Parallel C++), a library for parallel computing in a Partitioned Global Address Space (PGAS) environment, and on the well-known Intel TBB (Threading Building Blocks) library for multithreading. In this paper we present and evaluate the evolution of this library after changing its engine for shared memory parallelism and adapting it to the newest version of UPC++, which differs substantially from the original version on which UPC++ DepSpawn was developed. As we will see, while keeping the same high level of programmability, the new version is on average 9.3% faster than the old one, with a maximum speedup of 66.3%.
Acknowledgements
This research was supported by the Ministry of Science and Innovation of Spain (PID2019-104184RB-I00/AEI/10.13039/501100011033), and by the Xunta de Galicia, co-funded by the European Regional Development Fund (ERDF), under the Consolidation Programme of Competitive Reference Groups (ED431C 2021/30). We also acknowledge the support of the Centro Singular de Investigación de Galicia “CITIC”, funded by the Xunta de Galicia and the European Union (European Regional Development Fund, Galicia 2014–2020 Program) through grant ED431G 2019/01. Finally, we acknowledge the Centro de Supercomputación de Galicia (CESGA) for the use of their computers.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Fraguela, B.B., Andrade, D. (2022). The New UPC++ DepSpawn High Performance Library for Data-Flow Computing with Hybrid Parallelism. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2022. ICCS 2022. Lecture Notes in Computer Science, vol 13350. Springer, Cham. https://doi.org/10.1007/978-3-031-08751-6_55
DOI: https://doi.org/10.1007/978-3-031-08751-6_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08750-9
Online ISBN: 978-3-031-08751-6
eBook Packages: Computer Science, Computer Science (R0)