Fine-Tuning an OpenMP-Based TVD–Hopmoc Method Using Intel® Parallel Studio XE Tools on Intel® Xeon® Architectures

  • Conference paper
High Performance Computing (CARLA 2018)

Abstract

This paper is concerned with parallelizing the TVD–Hopmoc method for the numerical time integration of evolutionary differential equations. Using Intel® Parallel Studio XE tools, we studied three OpenMP implementations of the TVD–Hopmoc method (naive, CoP, and EWS-Sync), with executions performed on Intel® Xeon® Many Integrated Core Architecture and Intel® Xeon® Scalable processors. Our implementation, named EWS-Sync, defines an array that represents the threads, and its scheme synchronizes only adjacent threads. This approach also reduces the OpenMP scheduling time by employing an explicit work-sharing strategy: instead of permitting the OpenMP API to perform thread scheduling implicitly, this implementation of the 1-D TVD–Hopmoc method partitions the array that represents the computational mesh of the numerical method among the threads. The scheme thereby diminishes the OpenMP spin time by replacing barriers with an explicit synchronization mechanism in which a thread waits only for its two adjacent threads. Numerical simulations show that this approach achieves promising performance gains in shared memory for multi-core and many-core environments.
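The two ideas in the abstract, explicit work sharing and waiting only on adjacent threads, can be illustrated with a minimal C/OpenMP sketch. It is not the authors' code. The function name `ews_sync_sketch`, the per-thread progress array `done`, and the three-point averaging stencil are all assumptions made for illustration; the stencil merely stands in for the TVD–Hopmoc update, which the abstract does not spell out. The sketch also assumes double buffering and at least one mesh point per thread, and it uses OpenMP atomic flags where the paper's scheme may use a different mechanism.

```c
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Spin until *flag >= target. The seq_cst atomic read also makes the
 * neighbour's earlier mesh writes visible to this thread. */
static void wait_until_at_least(int *flag, int target)
{
    int seen;
    do {
        #pragma omp atomic read seq_cst
        seen = *flag;
    } while (seen < target);
}

/* Hypothetical sketch: advance a 1-D mesh of n points by nsteps steps.
 * u[0] holds the initial data, u[1] is scratch space of the same size.
 * Returns the index (0 or 1) of the buffer holding the result, or -1
 * on error. The stencil is a placeholder, NOT the TVD-Hopmoc update. */
int ews_sync_sketch(double *u[2], int n, int nsteps)
{
#ifdef _OPENMP
    int max_threads = omp_get_max_threads();
#else
    int max_threads = 1;
#endif
    /* Assumption: every thread owns at least one mesh point. */
    if (n < max_threads || nsteps < 0) return -1;

    /* done[t] = number of time steps thread t has completed. */
    int *done = calloc((size_t)max_threads, sizeof *done);
    if (!done) return -1;

    #pragma omp parallel
    {
#ifdef _OPENMP
        int tid = omp_get_thread_num();
        int nth = omp_get_num_threads();
#else
        int tid = 0, nth = 1;
#endif
        /* Explicit work sharing: one contiguous block per thread,
         * computed once instead of scheduling every loop. */
        int lo = (int)((long long)n * tid / nth);
        int hi = (int)((long long)n * (tid + 1) / nth);

        for (int s = 0; s < nsteps; ++s) {
            const double *src = u[s % 2];
            double *dst = u[(s + 1) % 2];

            /* No barrier: wait only for the two adjacent threads
             * to have finished the previous step. */
            if (tid > 0)       wait_until_at_least(&done[tid - 1], s);
            if (tid < nth - 1) wait_until_at_least(&done[tid + 1], s);

            for (int i = lo; i < hi; ++i) {
                double left  = src[i > 0 ? i - 1 : i];     /* clamped edge */
                double right = src[i < n - 1 ? i + 1 : i]; /* clamped edge */
                dst[i] = 0.25 * left + 0.5 * src[i] + 0.25 * right;
            }

            /* Publish progress so the neighbours may proceed. */
            #pragma omp atomic write seq_cst
            done[tid] = s + 1;
        }
    }
    free(done);
    return nsteps % 2;
}
```

A caller would allocate two arrays of `n` doubles, fill the first with the initial condition, pass both as `u`, and read the result from `u[k]`, where `k` is the return value. With GCC the OpenMP path is enabled by `-fopenmp`; without it the pragmas are ignored and the function runs serially. In this sketch a thread may run at most one time step ahead of its neighbours, which is why double buffering is sufficient; padding the `done` entries to separate cache lines would likely reduce false sharing but is omitted for brevity.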

Acknowledgments

CNPq, CAPES, and FAPERJ supported this work. We would like to thank the Núcleo de Computação Científica at Universidade Estadual Paulista (NCC/UNESP) for letting us execute our simulations on its heterogeneous multi-core cluster. These resources were partially funded by Intel® through the projects entitled Intel Parallel Computing Center, Modern Code Partner, and Intel/Unesp Center of Excellence in Machine Learning.

Corresponding author

Correspondence to Carla Osthoff.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Cabral, F.L. et al. (2019). Fine-Tuning an OpenMP-Based TVD–Hopmoc Method Using Intel® Parallel Studio XE Tools on Intel® Xeon® Architectures. In: Meneses, E., Castro, H., Barrios Hernández, C., Ramos-Pollan, R. (eds) High Performance Computing. CARLA 2018. Communications in Computer and Information Science, vol 979. Springer, Cham. https://doi.org/10.1007/978-3-030-16205-4_15

  • DOI: https://doi.org/10.1007/978-3-030-16205-4_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-16204-7

  • Online ISBN: 978-3-030-16205-4

  • eBook Packages: Computer Science, Computer Science (R0)