Skip to main content

The New UPC++ DepSpawn High Performance Library for Data-Flow Computing with Hybrid Parallelism

  • Conference paper
  • First Online:
Computational Science – ICCS 2022 (ICCS 2022)

Abstract

Data-flow computing is a natural and convenient paradigm for expressing parallelism. This is particularly true for tools that automatically extract the data dependencies among the tasks while allowing to exploit both distributed and shared memory parallelism. This is the case of UPC++ DepSpawn, a new task-based library developed on UPC++ (Unified Parallel C++), a library for parallel computing on a Partitioned Global Address Space (PGAS) environment, and the well-known Intel TBB (Threading Building Blocks) library for multithreading. In this paper we present and evaluate the evolution of this library after changing its engine for shared memory parallelism and adapting it to the newest version of UPC++, which differs very strongly from the original version on which UPC++ DepSpawn was developed. As we will see, while keeping the same high level of programmability, the new version is on average 9.3% faster than the old one, the maximum speedup being 66.3%.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 89.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 119.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. oneAPI Threading Building Blocks (oneTBB). https://github.com/oneapi-src/oneTBB. Accessed 26 Mar 2022

  2. Agullo, E., Aumage, O., Faverge, M., Furmento, N., Pruvost, F., Sergent, M., Thibault, S.: Harnessing clusters of hybrid nodes with a sequential task-based programming model. In: Intl. Workshop on Parallel Matrix Algorithms and Applications (PMAA 2014), July 2014

    Google Scholar 

  3. Augonnet, C., Thibault, S., Namyst, R., Wacrenier, P.: StarPU: a unified platform for task scheduling on heterogeneous multicore architectures. Concurrency Comput. Practice Exp. 23(2), 187–198 (2011)

    Article  Google Scholar 

  4. Bachan, J., et al.: UPC++: a high-performance communication framework for asynchronous computation. In: 2019 IEEE Intl. Parallel and Distributed Processing Symposium (IPDPS), pp. 963–973, May 2019

    Google Scholar 

  5. Bauer, M., Treichler, S., Slaughter, E., Aiken, A.: Legion: expressing locality and independence with logical regions. In: International Conference on High Performance Computing, Networking, Storage and Analysis, SC 2012, pp. 66:1–66:11 (2012)

    Google Scholar 

  6. Bonachea, D.: Gasnet specification. Technical report CSD-02-1207, University of California at Berkeley, Berkeley, CA, USA, October 2002

    Google Scholar 

  7. Bonachea, D., Hargrove, P.H.: GASNet-EX: a high-performance, portable communication library for exascale. In: Languages and Compilers for Parallel Computing, LCPC 2019, pp. 138–158 (2019)

    Google Scholar 

  8. Bonachea, D., Kamil, A.: UPC++ v1.0 Specification, Revision 2021.3.0. Technical report LBNL-2001388, Lawrence Berkeley National Laboratory, March 2021

    Google Scholar 

  9. Bosilca, G., et al.: Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA. In: 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum, pp. 1432–1441, May 2011

    Google Scholar 

  10. Bosilca, G., Bouteiller, A., Danalis, A., Hérault, T., Lemarinier, P., Dongarra, J.: DAGuE: a generic distributed DAG engine for high performance computing. Parallel Comput. 38(1–2), 37–51 (2012)

    Article  Google Scholar 

  11. Bueno, J., Martorell, X., Badia, R.M., Ayguadé, E., Labarta, J.: Implementing OmpSs support for regions of data in architectures with multiple address spaces. In: 27th International Conference on Supercomputing, ICS 2013, pp. 359–368 (2013)

    Google Scholar 

  12. Burke, M.G., Knobe, K., Newton, R., Sarkar, V.: UPC language specifications, v1.2. Technical report LBNL-59208, Lawrence Berkeley National Lab (2005)

    Google Scholar 

  13. Cosnard, M., Loi, M.: Automatic task graph generation techniques. In: 28th Annual Hawaii International Conference on System Sciences, HICSS’28, vol. 2, pp. 113–122, January 1995

    Google Scholar 

  14. Fraguela, B.B., Andrade, D.: Easy dataflow programming in clusters with UPC++ DepSpawn. IEEE Trans. Parallel Distrib. Syst. 30(6), 1267–1282 (2019)

    Article  Google Scholar 

  15. Fraguela, B.B., Andrade, D.: High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn. J. Supercomput. 77(7), 7676–7689 (2021). https://doi.org/10.1007/s11227-020-03607-1

    Article  Google Scholar 

  16. Fraguela, B.B., Bikshandi, G., Guo, J., Garzarán, M.J., Padua, D., von Praun, C.: Optimization techniques for efficient HTA programs. Parallel Comput. 38(9), 465–484 (2012)

    Article  Google Scholar 

  17. González, C.H., Fraguela, B.B.: A framework for argument-based task synchronization with automatic detection of dependencies. Parallel Comput. 39(9), 475–489 (2013)

    Article  Google Scholar 

  18. Reyes, R., Brown, G., Burns, R., Wong, M.: Sycl 2020: more than meets the eye. In: International Workshop on OpenCL, IWOCL 2020 (2020)

    Google Scholar 

  19. Slaughter, E., Lee, W., Treichler, S., Bauer, M., Aiken, A.: Regent: a high-productivity programming language for HPC with logical regions. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015, pp. 81:1–81:12 (2015)

    Google Scholar 

  20. Tejedor, E., Farreras, M., Grove, D., Badia, R.M., Almasi, G., Labarta, J.: A high-productivity task-based programming model for clusters. Concurrency Comput. Practice Exp. 24(18), 2421–2448 (2012)

    Article  Google Scholar 

  21. Wozniak, J.M., Armstrong, T.G., Wilde, M., Katz, D.S., Lusk, E., Foster, I.T.: Swift/T: Large-scale application composition via distributed-memory dataflow processing. In: 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, pp. 95–102, May 2013

    Google Scholar 

  22. YarKhan, A., Kurzak, J., Luszczek, P., Dongarra, J.: Porting the PLASMA numerical library to the OpenMP standard. Int. J. Parallel Program. 45(3), 612–633 (2017)

    Article  Google Scholar 

  23. Zheng, Y., Kamil, A., Driscoll, M.B., Shan, H., Yelick, K.: UPC++: a PGAS extension for C++. In: IEEE 28th International Parallel and Distributed Processing Symposium (IPDPS 2014), pp. 1105–1114, May 2014

    Google Scholar 

Download references

Acknowledgements

This research was supported by the Ministry of Science and Innovation of Spain (PID2019-104184RB-I00/AEI/10.13039/501100011033), and by the Xunta de Galicia co-founded by the European Regional Development Fund (ERDF) under the Consolidation Programme of Competitive Reference Groups (ED431C 2021/30). We acknowledge also the support from the Centro Singular de Investigación de Galicia “CITIC”, funded by Xunta de Galicia and the European Union (European Regional Development Fund- Galicia 2014–2020 Program), by grant ED431G 2019/01. Finally, we acknowledge the Centro de Supercomputación de Galicia (CESGA) for the use of their computers.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Basilio B. Fraguela .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Fraguela, B.B., Andrade, D. (2022). The New UPC++ DepSpawn High Performance Library for Data-Flow Computing with Hybrid Parallelism. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2022. ICCS 2022. Lecture Notes in Computer Science, vol 13350. Springer, Cham. https://doi.org/10.1007/978-3-031-08751-6_55

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-08751-6_55

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08750-9

  • Online ISBN: 978-3-031-08751-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics