
Exploiting Fine- and Coarse-Grained Parallelism Using a Directive Based Approach

  • Conference paper
OpenMP: Heterogenous Execution and Data Movements (IWOMP 2015)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9342)


Abstract

Modern high-performance machines are challenging to program because of the wide array of compute resources they offer, which often require low-level, specialized knowledge to exploit. OpenMP is a directive-based approach that effectively exploits shared-memory multicores. The recently introduced OpenMP 4.0 standard extends the directive-based approach to accelerators. However, programming clusters still requires the use of other specialized languages or libraries.

In this work we propose the use of the target offloading constructs to program nodes distributed in a cluster. We introduce an abstract model of a cluster that defines a clique of distinct shared-memory domains that are manipulated with the target constructs. We have implemented this model in the LLVM compiler with an OpenMP runtime that supports transparent offloading to nodes in a cluster using MPI. Our initial results on HMMER, a widely used Bioinformatics tool, show excellent scaling behavior with a small constant-factor overhead compared to a baseline MPI implementation. Our work raises the intriguing possibility of a natural progression of a program compiled for serial execution, to parallel execution on a multicore, to offloading onto accelerators, and finally, with minimal additional effort, extension onto a cluster.


Notes

  1. For example, the IBM Power® System E880 is configurable up to 16 TB. See http://www-03.ibm.com/systems/power/hardware/e880/.

  2. HMMER 3.1b2: http://hmmer.org.



Author information

Correspondence to Arpith C. Jacob.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Jacob, A.C. et al. (2015). Exploiting Fine- and Coarse-Grained Parallelism Using a Directive Based Approach. In: Terboven, C., de Supinski, B., Reble, P., Chapman, B., Müller, M. (eds) OpenMP: Heterogenous Execution and Data Movements. IWOMP 2015. Lecture Notes in Computer Science, vol 9342. Springer, Cham. https://doi.org/10.1007/978-3-319-24595-9_3

  • DOI: https://doi.org/10.1007/978-3-319-24595-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24594-2

  • Online ISBN: 978-3-319-24595-9

  • eBook Packages: Computer Science (R0)
