Evaluation of OpenMP Dependent Tasks with the KASTORS Benchmark Suite

  • Philippe Virouleau
  • Pierrick Brunet
  • François Broquedis
  • Nathalie Furmento
  • Samuel Thibault
  • Olivier Aumage
  • Thierry Gautier
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8766)

Abstract

The recent introduction of task dependencies in the OpenMP specification provides new ways of synchronizing tasks. Application programmers can now describe the data a task will read as input and write as output, letting the runtime system resolve fine-grain dependencies between tasks to decide which task should execute next. Such an approach should scale better than the excessive global synchronization found in most OpenMP 3.0 applications. As promising as it looks, however, any new feature needs proper evaluation to encourage application programmers to embrace it. This paper introduces the KASTORS benchmark suite, designed to evaluate OpenMP task dependencies. We modified state-of-the-art OpenMP 3.0 benchmarks and data-flow parallel linear algebra kernels to make use of task dependencies. Learning from this experience, we propose extensions to the current OpenMP specification to improve the expressiveness of dependencies. Finally, we evaluate both the GCC/libGOMP and the CLANG/libIOMP implementations of OpenMP 4.0 on our KASTORS suite, demonstrating the benefit of task dependencies compared to taskwait-based approaches.
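The contrast between taskwait-based synchronization and task dependencies can be illustrated with a small sketch. The example below is not taken from the KASTORS sources; the function names `sum_with_taskwait` and `sum_with_depend` and the toy arithmetic are assumptions made purely for illustration. Both variants compute the same value, but the first separates stages with global `taskwait` barriers, while the second declares what each task reads and writes through the OpenMP 4.0 `depend` clause and leaves the ordering to the runtime.

```c
#include <assert.h>

/* Hypothetical toy kernels, for illustration only (not from KASTORS). */

/* OpenMP 3.0 style: each stage ends with a global taskwait, so the
 * second stage cannot start until BOTH first-stage tasks are done. */
int sum_with_taskwait(int seed)
{
    int x = 0, y = 0, x2 = 0, y2 = 0, result = 0;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task shared(x)
        x = seed + 1;
        #pragma omp task shared(y)
        y = seed * 2;
        #pragma omp taskwait

        #pragma omp task shared(x, x2)
        x2 = x * 10;
        #pragma omp task shared(y, y2)
        y2 = y + 5;
        #pragma omp taskwait

        #pragma omp task shared(x2, y2, result)
        result = x2 + y2;
        #pragma omp taskwait
    }
    return result;
}

/* OpenMP 4.0 style: each task names its inputs and outputs. The task
 * writing x2 only waits for x, and the task writing y2 only waits for
 * y, so the two chains can progress independently of each other. */
int sum_with_depend(int seed)
{
    int x = 0, y = 0, x2 = 0, y2 = 0, result = 0;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task shared(x) depend(out: x)
        x = seed + 1;
        #pragma omp task shared(y) depend(out: y)
        y = seed * 2;

        #pragma omp task shared(x, x2) depend(in: x) depend(out: x2)
        x2 = x * 10;
        #pragma omp task shared(y, y2) depend(in: y) depend(out: y2)
        y2 = y + 5;

        #pragma omp task shared(x2, y2, result) \
                         depend(in: x2, y2) depend(out: result)
        result = x2 + y2;
    }
    /* All tasks have completed at the barrier ending the parallel
     * region, so the final values are visible here. */
    assert(result == x2 + y2);
    return result;
}
```

With a seed of 3, both functions return 51. The sketch assumes a compiler that supports OpenMP 4.0 (for example, GCC with `-fopenmp`); without OpenMP enabled, the pragmas are ignored and the code runs sequentially with the same result.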

Keywords

OpenMP, task dependencies, benchmarks, runtime systems, KASTORS

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Philippe Virouleau (1)
  • Pierrick Brunet (1)
  • François Broquedis (4)
  • Nathalie Furmento (2)
  • Samuel Thibault (3)
  • Olivier Aumage (1)
  • Thierry Gautier (1)

  1. INRIA, France
  2. CNRS, France
  3. University of Bordeaux, France
  4. MOAIS and RUNTIME Teams, Computer Science Laboratories of Grenoble and Bordeaux, Grenoble Institute of Technology, France
