Abstract
We present AfterOMPT, a new trace-based tool for analyzing the execution of OpenMP applications. It uses the OMPT interface to capture accurate information on loop partitioning, the distribution of iteration spaces across workers, task scheduling, and synchronization events. In contrast to previous work that relies on specific, instrumented runtime libraries, our tool can collect information from any runtime implementing the OMPT interface. To visualize the collected traces, we have extended the Aftermath performance analysis tool with renderers for OMPT events. We also propose an extension of the OMPT interface for collecting more detailed information on scheduled OpenMP loops. Experimental results show a tracing overhead of under \(5\%\) for the majority of the studied benchmarks, with the overhead increasing significantly for benchmarks with very fine-grained workloads.
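To make "distribution of iteration spaces across workers" concrete, the following is a minimal sketch of the kind of per-chunk information a loop trace can expose. It is an illustration only: the record layout `(worker, start, end)` and the function names `static_chunks` and `iterations_per_worker` are hypothetical and are not taken from AfterOMPT, Aftermath, or the OMPT interface. The block partition shown is one common way to split a statically scheduled loop; the OpenMP specification leaves the exact assignment to the implementation.

```python
from collections import defaultdict


def static_chunks(n_iters, n_workers):
    """Split [0, n_iters) into at most one contiguous chunk per worker.

    Hypothetical block partition: the first (n_iters % n_workers) workers
    receive one extra iteration. Real runtimes may partition differently.
    Returns a list of (worker, start, end) records with end exclusive.
    """
    base, extra = divmod(n_iters, n_workers)
    chunks = []
    start = 0
    for worker in range(n_workers):
        size = base + (1 if worker < extra else 0)
        if size > 0:  # workers with no iterations produce no record
            chunks.append((worker, start, start + size))
        start += size
    return chunks


def iterations_per_worker(chunk_events):
    """Sum the iterations executed by each worker from chunk records."""
    totals = defaultdict(int)
    for worker, start, end in chunk_events:
        totals[worker] += end - start
    return dict(totals)


# Example: 10 iterations over 4 workers.
events = static_chunks(10, 4)
print(iterations_per_worker(events))  # prints {0: 3, 1: 3, 2: 2, 3: 2}
```

With chunk records of this shape, an uneven distribution (for example under a dynamic or guided schedule) shows up directly as differing per-worker totals, which is the sort of imbalance a trace visualization is meant to reveal.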
This work was supported by the grant EuroEXA H2020-754337. Antoniu Pop is funded by the RAEng University Research Fellowship. Igor Wodiany is supported by the Department of Computer Science Kilburn Scholarship and the University of Manchester Presidents Award.
Notes
- 1.
- 2. Artifacts and sources are available at: https://github.com/IgWod/ompt-loops-tracing.
- 3. Partial verification of this changed dataset fails because it relies on pre-defined ranks for keys at specific locations. Full verification passes, so we assume that the algorithm executes correctly.
- 4.
- 5. The compilation error is caused by a potential bug in the unofficial C port of the benchmarks and does not appear in the official Fortran implementation.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Wodiany, I., Drebes, A., Neill, R., Pop, A. (2020). AfterOMPT: An OMPT-Based Tool for Fine-Grained Tracing of Tasks and Loops. In: Milfeld, K., de Supinski, B., Koesterke, L., Klinkenberg, J. (eds) OpenMP: Portable Multi-Level Parallelism on Modern Systems. IWOMP 2020. Lecture Notes in Computer Science, vol 12295. Springer, Cham. https://doi.org/10.1007/978-3-030-58144-2_11
DOI: https://doi.org/10.1007/978-3-030-58144-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58143-5
Online ISBN: 978-3-030-58144-2
eBook Packages: Computer Science (R0)