Order Preserving Event Aggregation in TBONs

  • Tobias Hilbrich
  • Matthias S. Müller
  • Martin Schulz
  • Bronis R. de Supinski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6960)

Abstract

Runtime tools for MPI applications must gather information from all processes to a tool front-end for presentation. Scalability requires that tools aggregate and reduce this information, so tool developers often use a Tree Based Overlay Network (TBON). TBONs aggregate multiple associated events through a hierarchical communication structure. We present a novel algorithm to execute multiple aggregations while, at the same time, preserving relevant event orders. We implement this algorithm in our tool infrastructure that provides TBON functionality as one of its services. We demonstrate that our approach provides scalability with experiments for up to 2048 tasks.
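
The hierarchical aggregation that a TBON performs can be illustrated with a minimal sketch. It is not the paper's algorithm and does not attempt order preservation; it only shows inner nodes reducing the events of their children into a single event that is forwarded upward. The names `Node`, `aggregate`, `build_tree`, and `fan_in`, and the choice of a sum as the reduction, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical TBON node; leaves stand in for application processes."""
    value: int = 0  # event payload produced at a leaf (assumed to be an int)
    children: List["Node"] = field(default_factory=list)


def aggregate(node: Node) -> int:
    """Reduce all events in a subtree to one value (here: a sum)."""
    if not node.children:
        return node.value
    # An inner node combines one event per child and forwards a single
    # aggregated event to its parent.
    return sum(aggregate(child) for child in node.children)


def build_tree(values: List[int], fan_in: int) -> Node:
    """Layer inner nodes over the leaves until a single root remains."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = [Node(value=v) for v in values]
    while len(level) > 1:
        level = [Node(children=level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
    return level[0]


# Eight leaf events reduced through a binary tree to the front-end (root).
root = build_tree(list(range(8)), fan_in=2)
total = aggregate(root)
```

With a fan-in of 2 and eight leaves, the root receives one aggregated event instead of eight individual ones, which is the reduction in front-end load that motivates the tree structure.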

Keywords

Order Preserve; Lawrence Livermore National Laboratory; Event Stream; Tool Developer; Multiple Aggregation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Tobias Hilbrich (1)
  • Matthias S. Müller (1)
  • Martin Schulz (2)
  • Bronis R. de Supinski (2)
  1. Technische Universität Dresden, ZIH, Dresden, Germany
  2. Lawrence Livermore National Laboratory, Livermore
