Performance Simulation of Non-blocking Communication in Message-Passing Applications

  • David Böhme
  • Marc-André Hermanns
  • Markus Geimer
  • Felix Wolf
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6043)

Abstract

In our previous work [1], we introduced performance simulation as an instrument to verify hypotheses on causality between locally and spatially distant performance phenomena without altering the application itself. This is accomplished by modifying MPI event traces and using them to simulate hypothetical message-passing behavior. Here, we present enhancements to our approach, which was previously restricted to blocking communication, that now allow us to correctly simulate MPI non-blocking communication. We enhanced the underlying trace data format to record communication requests, and extended the simulator to even retain the inherently non-deterministic behavior of operations such as MPI_Waitany.
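
To make the idea of recording communication requests in a trace more concrete, the following is a minimal toy sketch, not the authors' trace format or simulator. It assumes a hypothetical per-process trace in which each non-blocking operation is logged as a start event carrying a request identifier, and each completion (for example, the request returned by MPI_Waitany) is logged as a separate event carrying the same identifier. The names `Event`, `match_requests`, and the event kinds `ISEND`, `IRECV`, and `COMPLETE` are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    """One record of a hypothetical per-process event trace."""
    time: float       # timestamp of the event
    kind: str         # "ISEND" / "IRECV" (request start) or "COMPLETE"
    request_id: int   # identifier linking a start to its completion


def match_requests(trace: List[Event]) -> List[Tuple[Event, Event]]:
    """Pair each request start with its completion, in completion order."""
    open_requests: Dict[int, Event] = {}
    matched: List[Tuple[Event, Event]] = []
    for ev in trace:
        if ev.kind in ("ISEND", "IRECV"):
            if ev.request_id in open_requests:
                raise ValueError(f"request {ev.request_id} started twice")
            open_requests[ev.request_id] = ev
        elif ev.kind == "COMPLETE":
            start = open_requests.pop(ev.request_id, None)
            if start is None:
                raise ValueError(f"request {ev.request_id} was never started")
            matched.append((start, ev))
        else:
            raise ValueError(f"unknown event kind: {ev.kind}")
    if open_requests:
        raise ValueError(f"requests never completed: {sorted(open_requests)}")
    return matched


# Two receives are posted; request 2 happens to complete before request 1.
trace = [
    Event(0.0, "IRECV", 1),
    Event(0.1, "IRECV", 2),
    Event(0.5, "COMPLETE", 2),
    Event(0.9, "COMPLETE", 1),
]
pairs = match_requests(trace)
completion_order = [start.request_id for start, _ in pairs]
```

In this toy trace, the completion order differs from the posting order, which is the kind of information that can only be recovered when requests are recorded explicitly. How the actual simulator uses such records is beyond what this sketch covers.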

References

  1. Hermanns, M.A., Geimer, M., Wolf, F., Wylie, B.J.N.: Verifying causality between distant performance phenomena in large-scale MPI applications. In: Proceedings of the 17th International Conference on Parallel, Distributed, and Network-Based Processing (February 2009)
  2. Yan, J., Sarukkai, S., Mehra, P.: Performance Measurement, Visualization and Modeling of Parallel and Distributed Programs using the AIMS Toolkit. Software – Practice and Experience 25(4), 429–461 (1995)
  3. Rodriguez, G., Badia, R.M., Labarta, J.: Generation of simple analytical models for message passing applications. In: Danelutto, M., Vanneschi, M., Laforenza, D. (eds.) Euro-Par 2004. LNCS, vol. 3149, pp. 183–188. Springer, Heidelberg (2004)
  4. Zheng, G., Wilmarth, T., Jagadishprasad, P., Kalé, L.V.: Simulation-based performance prediction for large parallel machines. International Journal of Parallel Programming 33(2-3) (2005)
  5. Geimer, M., Wolf, F., Wylie, B.J.N., Mohr, B.: Scalable parallel trace-based performance analysis. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds.) PVM/MPI 2006. LNCS, vol. 4192, pp. 303–312. Springer, Heidelberg (2006)
  6. Geimer, M., Wolf, F., Knüpfer, A., Mohr, B., Wylie, B.J.N.: A parallel trace-data interface for scalable performance analysis. In: Kågström, B., Elmroth, E., Dongarra, J., Waśniewski, J. (eds.) PARA 2006. LNCS, vol. 4699, pp. 398–408. Springer, Heidelberg (2007)
  7. Bailey, D.H., Barszcz, E., Dagum, L., Simon, H.D.: NAS parallel benchmark results. IEEE Parallel Distrib. Technol. 1(1), 43–51 (1993)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • David Böhme (1, 2)
  • Marc-André Hermanns (1)
  • Markus Geimer (1)
  • Felix Wolf (1, 2)

  1. Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
  2. Aachen Institute for Advanced Study in Computational Engineering Science, RWTH Aachen University, Germany
