An OpenMP Implementation of the TVD–Hopmoc Method Based on a Synchronization Mechanism Using Locks Between Adjacent Threads on Xeon Phi (TM) Accelerators

  • Frederico L. Cabral
  • Carla Osthoff
  • Gabriel P. Costa
  • Sanderson L. Gonzaga de Oliveira
  • Diego Brandão
  • Mauricio Kischinhevsky
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10862)

Abstract

This work studies the 1–D TVD–Hopmoc method executed in shared-memory manycore environments. In particular, it examines barrier costs on Intel® Xeon Phi™ (KNC and KNL) accelerators when the OpenMP standard is used. The paper employs an explicit synchronization mechanism to reduce spin and thread-scheduling times in an OpenMP implementation of the 1–D TVD–Hopmoc method. Specifically, an array with one entry per thread is defined, and the new scheme synchronizes only adjacent threads. Moreover, the new approach reduces OpenMP scheduling time through an explicit work-sharing strategy: at the beginning of the process, the array that represents the computational mesh of the numerical method is partitioned among the threads, instead of letting the OpenMP API perform this task. The scheme thereby reduces OpenMP spin time by replacing OpenMP barriers with an explicit synchronization mechanism in which a thread waits only for its two adjacent threads. The results of the new approach are compared with a basic parallel implementation of the 1–D TVD–Hopmoc method. Numerical simulations show that the new approach achieves promising performance gains in shared-memory manycore environments.
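
The abstract outlines two ideas that can be illustrated independently of the TVD–Hopmoc operators themselves: explicit work-sharing (the mesh array is split into contiguous per-thread chunks once, up front, rather than by an OpenMP "for" schedule) and point-to-point synchronization (a thread waits only for its two adjacent threads instead of reaching a global barrier). The C/OpenMP fragment below is a minimal sketch of those two ideas only; it is not the authors' code. The mesh size, number of time steps, the simple explicit stencil standing in for the TVD–Hopmoc update, and the per-thread progress counters (used here in place of the lock-based handshake named in the title) are all illustrative assumptions.

    /* Minimal sketch (not the authors' code): explicit work-sharing plus
     * adjacent-thread synchronization without a global OpenMP barrier.   */
    #include <omp.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define MESH_SIZE 1000000   /* illustrative mesh size        */
    #define STEPS     1000      /* illustrative number of steps  */

    /* Ping-pong buffers holding u^t and u^{t+1}; boundaries stay at 0. */
    static double buf[2][MESH_SIZE];

    int main(void)
    {
        const int max_threads = omp_get_max_threads();
        /* One progress counter per thread: last time step it completed. */
        int *progress = calloc((size_t)max_threads, sizeof *progress);

        #pragma omp parallel
        {
            const int nthreads = omp_get_num_threads();
            const int tid      = omp_get_thread_num();

            /* Explicit work-sharing: contiguous chunk [lo, hi) owned by
               this thread, assigned once instead of by an OpenMP schedule. */
            const int chunk = (MESH_SIZE + nthreads - 1) / nthreads;
            const int lo    = tid * chunk;
            const int hi    = (lo + chunk < MESH_SIZE) ? lo + chunk : MESH_SIZE;

            for (int step = 1; step <= STEPS; step++) {
                const double *uold = buf[(step - 1) % 2];
                double       *unew = buf[step % 2];

                /* Point-to-point synchronization: wait only until the two
                   adjacent threads have finished the previous step, so the
                   halo points this thread reads are up to date.          */
                if (tid > 0) {
                    int p;
                    do {
                        #pragma omp atomic read
                        p = progress[tid - 1];
                    } while (p < step - 1);
                }
                if (tid < nthreads - 1) {
                    int p;
                    do {
                        #pragma omp atomic read
                        p = progress[tid + 1];
                    } while (p < step - 1);
                }

                /* Illustrative explicit stencil standing in for the
                   TVD-Hopmoc operators; interior points only.            */
                const int first = (lo > 0) ? lo : 1;
                const int last  = (hi < MESH_SIZE) ? hi : MESH_SIZE - 1;
                for (int i = first; i < last; i++)
                    unew[i] = uold[i]
                            + 0.5 * (uold[i - 1] - 2.0 * uold[i] + uold[i + 1]);

                /* Publish completion of this step so the neighbours may
                   proceed past their own waits.                          */
                #pragma omp atomic write
                progress[tid] = step;
            }
        }

        printf("u[MESH_SIZE/2] = %f\n", buf[STEPS % 2][MESH_SIZE / 2]);
        free(progress);
        return 0;
    }

The ping-pong buffers ensure that a neighbour running one step ahead never overwrites a value the current thread is still reading. On weakly ordered architectures a production version would also need explicit flushes or seq_cst atomics around the progress counters; this sketch relies on the atomic read/write constructs alone.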

Acknowledgments

This work was developed with the support of CNPq, CAPES, and FAPERJ - Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro. We would like to thank the Núcleo de Computação Científica at Universidade Estadual Paulista (NCC/UNESP) for letting us execute our simulations on its heterogeneous multi-core cluster. These resources were partially funded by Intel® through the projects entitled Intel Parallel Computing Center, Modern Code Partner, and Intel/Unesp Center of Excellence in Machine Learning.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Frederico L. Cabral (1)
  • Carla Osthoff (1), corresponding author
  • Gabriel P. Costa (1)
  • Sanderson L. Gonzaga de Oliveira (2)
  • Diego Brandão (3)
  • Mauricio Kischinhevsky (4)
  1. Laboratório Nacional de Computação Científica - LNCC, Petrópolis, Brazil
  2. Universidade Federal de Lavras - UFLA, Lavras, Brazil
  3. Centro Federal de Educação Tecnológica Celso Suckow da Fonseca - CEFET-RJ, Rio de Janeiro, Brazil
  4. Universidade Federal Fluminense - UFF, Rio de Janeiro, Brazil