OMPSan: Static Verification of OpenMP’s Data Mapping Constructs

  • Prithayan Barua (email author)
  • Jun Shirako
  • Whitney Tsang
  • Jeeva Paudel
  • Wang Chen
  • Vivek Sarkar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11718)

Abstract

OpenMP offers directives for offloading computations from CPU hosts to accelerator devices such as GPUs. A key underlying challenge is efficiently managing the movement of data between the host and the accelerator. User experience has shown that memory management in OpenMP programs with offloading capabilities is non-trivial and error-prone.

This paper presents OMPSan (OpenMP Sanitizer), a static analysis-based tool that helps developers detect bugs caused by incorrect usage of the map clause and suggests potential fixes for them. We have developed an LLVM-based data flow analysis that validates whether the def-use information of the array variables is respected by the mapping constructs in the OpenMP program. We evaluate OMPSan on several standard benchmarks and also show its effectiveness by detecting commonly reported bugs.

Keywords

OpenMP offloading; OpenMP target data mapping; LLVM; Memory management; Static analysis; Verification; Debugging

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Prithayan Barua (1), email author
  • Jun Shirako (1)
  • Whitney Tsang (2)
  • Jeeva Paudel (2)
  • Wang Chen (2)
  • Vivek Sarkar (1)

  1. Georgia Institute of Technology, Atlanta, Georgia
  2. IBM Toronto Laboratory, Markham, Canada