Memory Usage Optimizations for Online Event Analysis

Hilbrich, Tobias; Protze, Joachim; Wagner, Michael; Müller, Matthias S.; Schulz, Martin; de Supinski, Bronis R.; Nagel, Wolfgang E.

doi:10.1007/978-3-319-15976-8_8


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8759)

Included in the following conference series:

International Conference on Exascale Applications and Software


Abstract

Tools are essential for application developers and system support personnel during tasks such as performance optimization and debugging of massively parallel applications. An important class consists of event-based tools, which analyze relevant events during the runtime of an application, e.g., function invocations or communication operations. We develop a parallel tools infrastructure that supports both the observation and analysis of application events at runtime. Some analyses, e.g., deadlock detection algorithms, require complex processing and apply to many types of frequently occurring events. When an application generates new events faster than the analysis can process them, the tool becomes unstable or even fails, e.g., through memory exhaustion. Tool infrastructures must provide means to avoid or mitigate such situations. This paper explores two such techniques: first, a heuristic that selects which events to receive and process next; second, a pause mechanism that temporarily suspends the execution of an application. A study with applications from the SPEC MPI2007 benchmark suite and the NAS parallel benchmarks evaluates these techniques at up to 16,384 processes and illustrates how they avoid memory exhaustion problems that limited the applicability of a runtime correctness tool in the past.
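
The abstract does not spell out how the pause mechanism works, so the following is only a toy sketch of the general idea, not the paper's implementation. It assumes a single event queue, a fixed production and processing rate per time step, and two hypothetical thresholds (`high_water`, `low_water`) that decide when the application is suspended and resumed. All names and numbers are illustrative assumptions. The event-selection heuristic is not modeled.

```python
from collections import deque


def simulate(ticks, produce_per_tick, process_per_tick,
             high_water=None, low_water=0):
    """Toy model of an analysis tool's event queue.

    Returns (peak_queue_length, paused_ticks).
    high_water / low_water are hypothetical thresholds (an assumption,
    not taken from the paper): the application is paused once the queue
    holds at least high_water events and resumed once it has drained
    to low_water or fewer. With high_water=None, no pausing happens.
    """
    queue = deque()
    paused = False
    peak = 0
    paused_ticks = 0
    next_event_id = 0

    for _ in range(ticks):
        # Decide whether the application runs during this tick.
        if high_water is not None:
            if not paused and len(queue) >= high_water:
                paused = True
            elif paused and len(queue) <= low_water:
                paused = False

        if paused:
            paused_ticks += 1
        else:
            # The application generates new events.
            for _ in range(produce_per_tick):
                queue.append(next_event_id)
                next_event_id += 1

        peak = max(peak, len(queue))

        # The analysis processes as many events as it can this tick.
        for _ in range(min(process_per_tick, len(queue))):
            queue.popleft()

    return peak, paused_ticks


if __name__ == "__main__":
    # Application produces 5 events per tick; analysis handles only 2.
    print("no pause:  ", simulate(1000, 5, 2))
    print("with pause:", simulate(1000, 5, 2, high_water=100, low_water=10))
```

In this model, the queue without pausing grows in proportion to the run length, whereas with pausing its peak stays below `high_water + produce_per_tick`. The cost is that the application spends some ticks suspended, which is the trade-off a pause mechanism accepts in exchange for bounded memory.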

The rights of this work are transferred to the extent transferable according to title 17 §105 U.S.C.


Notes

1. Uses numbers of processes that are a multiple of three.
2. The lref data set operates with up to 2,048 processes (http://www.spec.org/mpi/docs/faq.html#DataSetL).


Acknowledgments

We thank the ASC Tri-Labs and the Los Alamos National Laboratory for their friendly support. Part of this work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-CONF-652119). This work has been supported by the CRESTA project, which has received funding from the European Community’s Seventh Framework Programme (ICT-2011.9.13) under Grant Agreement no. 287703.

Author information

Authors and Affiliations

Technische Universität Dresden, 01062, Dresden, Germany
Tobias Hilbrich, Michael Wagner & Wolfgang E. Nagel
RWTH Aachen University, 52056, Aachen, Germany
Joachim Protze & Matthias S. Müller
JARA – High-Performance Computing, 52062, Aachen, Germany
Joachim Protze & Matthias S. Müller
Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
Martin Schulz & Bronis R. de Supinski


Corresponding author

Correspondence to Martin Schulz.

Editor information

Editors and Affiliations

KTH Royal Institute of Technology, Stockholm, Sweden
Stefano Markidis
KTH Royal Institute of Technology, Stockholm, Sweden
Erwin Laure


Cite this paper

Hilbrich, T. et al. (2015). Memory Usage Optimizations for Online Event Analysis. In: Markidis, S., Laure, E. (eds) Solving Software Challenges for Exascale. EASC 2014. Lecture Notes in Computer Science, vol 8759. Springer, Cham. https://doi.org/10.1007/978-3-319-15976-8_8


DOI: https://doi.org/10.1007/978-3-319-15976-8_8
Published: 19 February 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-15975-1
Online ISBN: 978-3-319-15976-8
eBook Packages: Computer Science (R0)
