Extending OMPT to Support Grain Graphs

  • Peder Voldnes Langdal
  • Magnus Jahre
  • Ananya Muddukrishna
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10468)


The upcoming profiling API standard OMPT can describe almost all profiling events required to construct grain graphs, a recent visualization that simplifies OpenMP performance analysis. We propose OMPT extensions that provide the missing descriptions of task creation and parallel for-loop chunk scheduling events, making OMPT a sufficient, standard source for grain graphs. Our extensions adhere to OMPT design objectives and incur low overhead: up to 2% for BOTS programs and 1% for SPEC OMP2012 programs. Although motivated by grain graphs, the events described by the extensions are general and can enable cost-effective, precise measurements in other profiling tools as well.
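To make the idea of a chunk scheduling event concrete, the following is a minimal sketch in C of how a runtime might report each for-loop chunk to a profiling tool through a callback. It is not the OMPT interface and not the extension proposed in the paper: the names `chunk_event_t`, `chunk_callback_t`, `record_chunk`, `run_chunked_loop` and `demo` are hypothetical, the "runtime" is a serial toy that hands out fixed-size chunks round-robin, and `chunk_size` is assumed to be positive.

```c
#include <assert.h>

/* Hypothetical event record; NOT an OMPT type, illustration only. */
typedef struct {
    int  thread_id;   /* thread the chunk is assigned to */
    long first_iter;  /* first iteration of the chunk (inclusive) */
    long last_iter;   /* last iteration of the chunk (inclusive) */
} chunk_event_t;

/* Hypothetical callback signature a tool would register. */
typedef void (*chunk_callback_t)(const chunk_event_t *event, void *tool_data);

/* Minimal tool state: number of chunks seen and the most recent one. */
typedef struct {
    int           count;
    chunk_event_t last;
} chunk_log_t;

/* Example tool callback: record every chunk it is told about. */
static void record_chunk(const chunk_event_t *event, void *tool_data) {
    chunk_log_t *log = tool_data;
    log->count++;
    log->last = *event;
}

/* Toy serial stand-in for a runtime: split [0, n_iters) into chunks of
 * chunk_size iterations (assumed > 0), assign them round-robin to
 * n_threads, and notify the tool once per chunk. */
static void run_chunked_loop(long n_iters, long chunk_size, int n_threads,
                             chunk_callback_t cb, void *tool_data) {
    int next_thread = 0;
    for (long first = 0; first < n_iters; first += chunk_size) {
        long last = first + chunk_size - 1;
        if (last >= n_iters) {
            last = n_iters - 1;  /* final chunk may be shorter */
        }
        chunk_event_t ev = { next_thread, first, last };
        if (cb) {
            cb(&ev, tool_data);
        }
        /* The loop body for iterations first..last would run here. */
        next_thread = (next_thread + 1) % n_threads;
    }
}

/* Usage: 10 iterations, chunk size 4, 2 threads. */
static chunk_log_t demo(void) {
    chunk_log_t log = {0};
    run_chunked_loop(10, 4, 2, record_chunk, &log);
    return log;
}
```

In this sketch the tool sees one event per chunk with its iteration bounds, which is the kind of per-chunk information a grain graph needs in order to draw each chunk as a separate grain.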


Keywords: OMPT, Performance analysis, Performance visualization



We are grateful to anonymous reviewers for comments that helped improve the paper. One reviewer generously pointed out that our proposed chunk callback can be applied in data race detection tools as well. We thank Joachim Protze (RWTH Aachen), Jonas Hahnfeld (RWTH Aachen), Sergei Shudler (TU Darmstadt) and Harald Servat (Intel) for helpful comments and suggestions regarding the extensions. This paper is partially funded by the TULIPP project, grant number 688403 from the EU Horizon 2020 Research and Innovation programme.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Peder Voldnes Langdal (1)
  • Magnus Jahre (1)
  • Ananya Muddukrishna (1)
  1. Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway
