Order Preserving Event Aggregation in TBONs

Hilbrich, Tobias; Müller, Matthias S.; Schulz, Martin; de Supinski, Bronis R.

doi:10.1007/978-3-642-24449-0_5


Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 6960)

Included in the following conference series:

European MPI Users' Group Meeting


Abstract

Runtime tools for MPI applications must gather information from all processes to a tool front-end for presentation. Scalability requires that tools aggregate and reduce this information, so tool developers often use a Tree Based Overlay Network (TBON). TBONs aggregate multiple associated events through a hierarchical communication structure. We present a novel algorithm to execute multiple aggregations while, at the same time, preserving relevant event orders. We implement this algorithm in our tool infrastructure that provides TBON functionality as one of its services. We demonstrate that our approach provides scalability with experiments for up to 2048 tasks.
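To make the idea of aggregating events in a tree while keeping their order concrete, the following is a minimal Python sketch of a single inner TBON node. It is not the algorithm from the paper, which the abstract does not describe. The class `TbonNode`, the event tuples `(kind, name, value)`, and the rule "hold everything behind a pending aggregation on the same channel" are all assumptions made for illustration. The sketch also assumes that every child issues its aggregatable events in the same order.

```python
from collections import deque


class TbonNode:
    """Hypothetical inner node of a tree based overlay network (illustrative only)."""

    def __init__(self, num_children):
        # One FIFO queue per child channel keeps the per-channel arrival order.
        self.queues = [deque() for _ in range(num_children)]
        # Events sent toward the root, in the order they were forwarded.
        self.forwarded = []

    def receive(self, child, event):
        """Buffer an event from a child, then forward whatever is now safe."""
        self.queues[child].append(event)
        self._drain()

    def _drain(self):
        progress = True
        while progress:
            progress = False
            # A plain event at the head of a channel has nothing pending before it.
            for q in self.queues:
                while q and q[0][0] == "plain":
                    self.forwarded.append(q.popleft())
                    progress = True
            # An aggregation completes once every channel has the same
            # aggregatable event at its head.
            heads = [q[0] for q in self.queues if q]
            if len(heads) == len(self.queues) and len({h[1] for h in heads}) == 1:
                total = sum(q.popleft()[2] for q in self.queues)
                self.forwarded.append(("agg", heads[0][1], total))
                progress = True


node = TbonNode(num_children=2)
node.receive(0, ("plain", "send", 1))
node.receive(0, ("agg", "barrier", 1))   # waits for child 1
node.receive(0, ("plain", "recv", 1))    # held behind the pending barrier
node.receive(1, ("agg", "barrier", 1))   # completes the aggregation
print(node.forwarded)
```

In this sketch the `recv` event from child 0 is forwarded only after the aggregated `barrier` event, so the order seen on that channel is preserved at the cost of buffering events while an aggregation is incomplete.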



Author information

Authors and Affiliations

Technische Universität Dresden, ZIH, D-01062, Dresden, Germany
Tobias Hilbrich & Matthias S. Müller
Lawrence Livermore National Laboratory, Livermore, CA, 94551
Martin Schulz & Bronis R. de Supinski


Editor information

Editors and Affiliations

Department of Informatics and Telecommunications, University of Athens, 15784, Athens, Greece
Yiannis Cotronis
University of Tennessee, 1122 Volunteer Blvd, 37996-3450, Knoxville, TN, USA
Anthony Danalis & Jack Dongarra
University of Crete, Heraklion, Greece
Dimitrios S. Nikolopoulos


About this paper

Cite this paper

Hilbrich, T., Müller, M.S., Schulz, M., de Supinski, B.R. (2011). Order Preserving Event Aggregation in TBONs. In: Cotronis, Y., Danalis, A., Nikolopoulos, D.S., Dongarra, J. (eds) Recent Advances in the Message Passing Interface. EuroMPI 2011. Lecture Notes in Computer Science, vol 6960. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24449-0_5


DOI: https://doi.org/10.1007/978-3-642-24449-0_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24448-3
Online ISBN: 978-3-642-24449-0
eBook Packages: Computer Science, Computer Science (R0)
