Abstract
The TAU Performance System® (TAU) is a powerful and highly versatile profiling and tracing tool ecosystem for performance engineering of parallel programs. Developed over the last twenty years, TAU has evolved with each new generation of HPC systems and scales efficiently to hundreds of thousands of cores. TAU’s organic growth has resulted in a loosely coupled software toolbox such that novice users first encountering TAU’s complexity and vast array of features are often intimidated and easily frustrated. To lower the barrier to entry for novice TAU users, ParaTools and the US Department of Energy have developed “TAU Commander,” a performance engineering workflow manager that facilitates a systematic approach to performance engineering, guides users through common profiling and tracing workflows, and offers constructive feedback in case of error. This work compares TAU and TAU Commander workflows for common performance engineering tasks in OpenSHMEM applications and demonstrates workflows targeting two different SHMEM implementations, Intel Xeon “Haswell” and “Knights Landing” processors, direct and indirect measurement methods, callsite, profiles, and traces.
Notes
1. Note that not producing any data is a valid experimental result, i.e. this particular experiment raises a fault in the application and the end goal is to use post-mortem debugging to determine the cause of the fault [19].
References
U.S. Department of Energy INCITE leadership computing, December 2015. http://www.doeleadershipcomputing.org/
Bader, D.A., Cong, G.: Fast shared-memory algorithms for computing the minimum spanning forest of sparse graphs. J. Par. Distrib. Comp. 66(11), 1366–1378 (2006). http://dx.doi.org/10.1016/j.jpdc.2006.06.001
Browne, S., Dongarra, J., Garner, N., Ho, G., Mucci, P.: A portable programming interface for performance evaluation on modern processors. Int. J. High Perform. Comput. Appl. 14(3), 189–204 (2000)
Chapman, B., Curtis, T., Pophale, S., Poole, S., Kuehn, J., Koelbel, C., Smith, L.: Introducing OpenSHMEM: SHMEM for the PGAS community. In: Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model, PGAS 2010, pp. 2:1–2:3. ACM, New York (2010). http://doi.acm.org/10.1145/2020373.2020375
Francis, I., Drugan, C.: Groundbreaking astrophysics accelerated. HPC Source, February 2013
Geimer, M., Wolf, F., Wylie, B.J.N., Mohr, B.: Scalable parallel trace-based performance analysis. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds.) EuroPVM/MPI 2006. LNCS, vol. 4192, pp. 303–312. Springer, Heidelberg (2006). https://doi.org/10.1007/11846802_43
Hemstad, J., Hanebutte, U.R.: ISx: An integer sort mini-application for the exascale era (2015). Partitioned Global Address Space SC’15 Booth
Jose, J., Kandalla, K., Luo, M., Panda, D.: Supporting hybrid MPI and OpenSHMEM over infiniband: design and performance evaluation. In: The 41st International Conference on Parallel Processing (ICPP), pp. 219–228 (2012)
Knupfer, A., Brunst, H., Nagel, W.: High performance event trace visualization. In: Proceedings of Parallel and Distributed Processing (PDP). IEEE (2005)
Linford, J.C.: TAU commander developer documentation, June 2017. http://paratoolsinc.github.io/taucmdr/
Linford, J.C., Vadlamani, S., Shende, S., Malony, A.D., Jones, W., Anderson, W.K., Nielsen, E.: Performance engineering FUN3D at scale with TAU Commander. In: Proceedings of the ACM/IEEE The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2016), November 2016. To Appear
Malony, A., Biersdorff, S., Shende, S., Jagode, H., Tomov, S., Juckeland, G., Dietrich, R., Poole, D., Lamb, C.: Parallel performance measurement of heterogeneous parallel systems with GPUs. In: 2011 International Conference on Parallel Processing (ICPP), pp. 176–185, September 2011
Malony, A.D., Mellor-Crummey, J., Shende, S.S.: Measurement and analysis of parallel program performance using TAU and HPCToolkit. In: Performance Tuning of Scientific Applications. CRC Press, New York, November 2010
ParaTools, Inc.: TAU Commander: An intuitive interface for the TAU Performance Analysis System (2014). https://www.sbir.gov/sbirsearch/detail/687037
Perez, J., Shende, S.: Furthering the understanding of coronal heating and solar wind origin. Technical report, Argonne National Labs, January 2013
Pophale, S., Nanjegowda, R., Curtis, T., Chapman, B., Jin, H., Poole, S., Kuehn, J.: OpenSHMEM performance and potential: a NPB experimental study. In: The 6th Conference on Partitioned Global Address Space Programming Models (PGAS 2012) (2012)
Seager, K., Choi, S.-E., Dinan, J., Pritchard, H., Sur, S.: Design and implementation of OpenSHMEM using OFI on the aries interconnect. In: Venkata, M.G., Imam, N., Pophale, S., Mintz, T.M. (eds.) OpenSHMEM 2016. LNCS, vol. 10007, pp. 97–113. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50995-2_7
Shende, S., Malony, A.: The TAU Parallel Performance System. Int. J. High Perform. Comput. Appl. 20(2), 287–311 (2006)
Shende, S., Malony, A., Linford, J., Wissink, A., Adamec, S.: Isolating runtime faults with callstack debugging using TAU. In: Proceedings of the HPEC 2012 Conference (2012)
Acknowledgments
This work is supported by the United States Department of Energy under DOE SBIR grant DE-SC0009593. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Linford, J.C., Khuvis, S., Shende, S., Malony, A., Imam, N., Venkata, M.G. (2018). Performance Analysis of OpenSHMEM Applications with TAU Commander. In: Gorentla Venkata, M., Imam, N., Pophale, S. (eds) OpenSHMEM and Related Technologies. Big Compute and Big Data Convergence. OpenSHMEM 2017. Lecture Notes in Computer Science, vol 10679. Springer, Cham. https://doi.org/10.1007/978-3-319-73814-7_11
DOI: https://doi.org/10.1007/978-3-319-73814-7_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73813-0
Online ISBN: 978-3-319-73814-7
eBook Packages: Computer Science (R0)