Diagnosing Highly-Parallel OpenMP Programs with Aggregated Grain Graphs

  • Nico Reissmann
  • Ananya Muddukrishna
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11014)


Grain graphs simplify OpenMP performance analysis by visualizing performance problems from a fork-join perspective that is familiar to programmers. However, when programmers expose a large amount of parallelism by creating thousands of task and parallel for-loop chunk instances, the resulting grain graph becomes large and tedious to understand. We present an aggregation method that hierarchically groups related nodes to reduce a grain graph of any size to a single node. Programmers then navigate the aggregated graph by progressively uncovering groups and following visual clues that guide them towards problems while hiding non-problematic regions. Our approach enhances productivity by enabling programmers to understand problems in highly-parallel OpenMP programs with less effort than before.
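The idea of hierarchical grouping and progressive uncovering can be illustrated with a minimal sketch. It is not the method from the paper: the `Node` class, the `visible` and `uncover` helpers, and the problem test (a grain whose execution time falls below a threshold) are all hypothetical names and assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A grain (leaf) or a group of related nodes (hypothetical structure)."""
    name: str
    exec_time: float = 0.0                      # only meaningful for leaves
    children: List["Node"] = field(default_factory=list)
    expanded: bool = False                      # collapsed groups hide members

    def has_problem(self, threshold: float) -> bool:
        # Assumed problem test: some grain runs for less than `threshold`.
        if not self.children:
            return self.exec_time < threshold
        return any(c.has_problem(threshold) for c in self.children)


def visible(node: Node) -> List[Node]:
    """Return the nodes currently shown: leaves and collapsed groups."""
    if not node.children or not node.expanded:
        return [node]
    shown: List[Node] = []
    for child in node.children:
        shown.extend(visible(child))
    return shown


def uncover(root: Node, threshold: float) -> None:
    """Uncover one level: expand each visible group that contains a problem."""
    for node in visible(root):
        if node.children and node.has_problem(threshold):
            node.expanded = True


# Two groups under a single top-level node; only loop_b holds a short grain.
loop_a = Node("loop_a", children=[Node(f"a{i}", 5.0) for i in range(3)])
loop_b = Node("loop_b", children=[Node("b0", 0.1), Node("b1", 4.0)])
root = Node("program", children=[loop_a, loop_b])

step0 = [n.name for n in visible(root)]   # fully aggregated: one node
uncover(root, threshold=1.0)
step1 = [n.name for n in visible(root)]   # top-level groups
uncover(root, threshold=1.0)
step2 = [n.name for n in visible(root)]   # loop_a stays collapsed
```

In this sketch the whole graph starts as one node, and each call to `uncover` opens only the groups that contain a flagged grain, so the non-problematic `loop_a` group remains hidden behind a single node.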



The paper was funded by the TULIPP project (grant number 688403) and the READEX project (grant number 671657) from the EU Horizon 2020 Research and Innovation programme. The authors thank NTNU colleagues Peder Voldnes Langdal, Magnus Själander, Jan Christian Meyer, and Magnus Jahre for constructive comments and KTH Royal Institute of Technology for providing test machinery.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Norwegian University of Science and Technology, Trondheim, Norway
