Abstract
Achieving good scalability from parallel codes is becoming increasingly difficult as hardware grows more complex. Performance tools help developers, but using them is sometimes complicated and highly iterative. In this paper, we propose a simple methodology for assessing the scalability of an OpenMP application and detecting its performance problems. This methodology is implemented in a performance analysis tool named ScalOMP, which relies on the capabilities of OMPT to analyze OpenMP applications. ScalOMP reports the code regions with scalability issues and suggests optimization strategies for those issues. The evaluation shows that ScalOMP incurs low overhead and that its suggestions lead to significant performance improvements in several OpenMP applications.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Daumen, A., Carribault, P., Trahay, F., Thomas, G. (2019). ScalOMP: Analyzing the Scalability of OpenMP Applications. In: Fan, X., de Supinski, B., Sinnen, O., Giacaman, N. (eds) OpenMP: Conquering the Full Hardware Spectrum. IWOMP 2019. Lecture Notes in Computer Science, vol 11718. Springer, Cham. https://doi.org/10.1007/978-3-030-28596-8_3
DOI: https://doi.org/10.1007/978-3-030-28596-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28595-1
Online ISBN: 978-3-030-28596-8
eBook Packages: Computer Science, Computer Science (R0)