Extending OpenMP Metadirective Semantics for Runtime Adaptation

  • Yonghong Yan (corresponding author)
  • Anjia Wang
  • Chunhua Liao
  • Thomas R. W. Scogland
  • Bronis R. de Supinski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11718)

Abstract

OpenMP 5.0 introduces the metadirective to support selection from a set of directive variants based on the OpenMP context, which is composed of traits from active OpenMP constructs, devices, implementations or user-defined conditions. OpenMP 5.0 requires the selection to be resolved at compile time, so all traits must be compile-time constants. Our analysis of real applications indicates that this restriction is limiting, and we explore extending user-defined contexts to support variant selection at runtime. We use the Smith-Waterman algorithm as an example to show the need for adaptive selection of parallelism and devices at runtime, and present a prototype implemented in the ROSE compiler. Across a large range of input sizes, our experiments demonstrate that one of the adaptive versions of Smith-Waterman always chooses the parallelism and device that deliver the best performance, with improvements between 20% and 200% compared to non-adaptive versions that use the other approaches.
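
To make the idea of runtime variant selection concrete, the following is a minimal sketch in C of what a metadirective with a runtime user condition could amount to. It is not the authors' ROSE prototype: the function name `sum_adaptive`, the cutoff `PARALLEL_THRESHOLD`, and the hand-written if/else lowering are all assumptions made for illustration. The comment shows a directive that OpenMP 5.0 would not accept, because `n` is not a compile-time constant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cutoff (an assumption for this sketch): below this size the
   loop is presumed too small to repay the cost of a parallel region. */
#define PARALLEL_THRESHOLD 1024

/*
 * With a runtime-evaluated user condition, the choice might be written as:
 *
 *   #pragma omp metadirective \
 *       when(user={condition(n >= PARALLEL_THRESHOLD)}: \
 *            parallel for reduction(+:total))
 *
 * OpenMP 5.0 does not allow this because n is only known at runtime.
 * The function below spells out the same decision by hand as an if/else
 * over two variants of the loop.
 */
long sum_adaptive(const int *data, size_t n)
{
    long total = 0;

    assert(data != NULL || n == 0);

    if (n >= PARALLEL_THRESHOLD) {
        /* Parallel variant: chosen at runtime for large inputs. */
        #pragma omp parallel for reduction(+:total)
        for (long i = 0; i < (long)n; ++i) {
            total += data[i];
        }
    } else {
        /* Serial variant: chosen at runtime for small inputs. */
        for (size_t i = 0; i < n; ++i) {
            total += data[i];
        }
    }
    return total;
}
```

Both branches compute the same result, so the selection affects only performance. Compiled without OpenMP support, the pragma is ignored and both variants run serially. A real cutoff would have to be measured per machine and per kernel rather than fixed as it is here.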

Keywords

OpenMP 5.0, Metadirective, Dynamic context

Notes

Acknowledgment

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, and supported by the U.S. Dept. of Energy, Office of Science, Advanced Scientific Computing Research (SC-21), under contract DE-AC02-06CH11357. The manual reference codes were supported by LLNL-LDRD 18-ERD-006. LLNL-CONF-774899. This material is also based upon work supported by the National Science Foundation under Grants No. 1833332 and 1652732.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Yonghong Yan (1, corresponding author)
  • Anjia Wang (1)
  • Chunhua Liao (2)
  • Thomas R. W. Scogland (2)
  • Bronis R. de Supinski (2)
  1. University of South Carolina, Columbia, USA
  2. Lawrence Livermore National Laboratory, Livermore, USA
