Event-Based Measurement and Analysis of One-Sided Communication

  • Marc-André Hermanns
  • Bernd Mohr
  • Felix Wolf
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3648)

Abstract

To analyze the correctness and performance of a program, information about the dynamic behavior of all participating processes is needed. This behavior can be modeled as a stream of events, each carrying the attributes required for later analysis. Based on this idea, KOJAK, a trace-based toolkit for performance analysis, records and analyzes the activities of MPI-1 point-to-point and collective communication.
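The idea of an event stream with attributes can be illustrated with a minimal sketch. The `Event` record, its field names, the event kinds, and the `merged_stream` helper are hypothetical and chosen only for illustration; they are not KOJAK's actual trace format.

```python
from dataclasses import dataclass, field
from heapq import merge


@dataclass(frozen=True)
class Event:
    time: float   # timestamp of the event
    process: int  # rank of the process that produced the event
    kind: str     # hypothetical event type, e.g. "ENTER", "SEND", "RECV", "EXIT"
    attrs: dict = field(default_factory=dict)  # type-specific attributes


def merged_stream(*local_streams):
    """Merge per-process streams (each already sorted by time) into one global stream."""
    return list(merge(*local_streams, key=lambda e: e.time))


# Two illustrative per-process streams for one point-to-point message.
p0 = [
    Event(0.10, 0, "ENTER", {"region": "MPI_Send"}),
    Event(0.20, 0, "SEND", {"dest": 1, "bytes": 1024}),
    Event(0.30, 0, "EXIT", {"region": "MPI_Send"}),
]
p1 = [
    Event(0.05, 1, "ENTER", {"region": "MPI_Recv"}),
    Event(0.35, 1, "RECV", {"source": 0, "bytes": 1024}),
    Event(0.40, 1, "EXIT", {"region": "MPI_Recv"}),
]

trace = merged_stream(p0, p1)
```

A later analysis step could then walk `trace` and, for example, match each `SEND` with its `RECV` by comparing the `dest` and `source` attributes.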

To support remote memory access (RMA) hardware in a portable way, MPI-2 introduced a standardized interface for remote memory access. However, potential performance gains come at the expense of more complex semantics. From a programmer’s point of view, an MPI-2 data transfer is only completed after a sequence of communication and associated synchronization calls.
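The completion rule can be pictured with a toy model, loosely inspired by the pairing of `MPI_Put` with `MPI_Win_fence`. The `Window` class and its methods are assumptions made for illustration and are not the MPI API. A real implementation may move the data earlier; the sketch only shows the point at which the programmer may rely on the transfer being complete.

```python
class Window:
    """Toy model of an RMA window: puts are buffered until a synchronization call."""

    def __init__(self, size):
        self.memory = [0] * size  # target memory exposed for remote access
        self._pending = []        # transfers issued but not yet completed

    def put(self, offset, values):
        # Communication call: only registers the transfer.
        self._pending.append((offset, list(values)))

    def fence(self):
        # Synchronization call: completes all pending transfers.
        for offset, values in self._pending:
            self.memory[offset:offset + len(values)] = values
        self._pending.clear()


win = Window(4)
win.fence()                # start of the access epoch
win.put(1, [7, 8])         # transfer issued, completion not yet guaranteed
before = list(win.memory)  # target memory as seen before synchronization
win.fence()                # end of the epoch: the transfer is now complete
after = list(win.memory)
```

In this model `before` still holds the initial zeros, while `after` contains the transferred values, which mirrors why an event model has to relate each communication call to the synchronization call that completes it.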

This paper describes the integration of performance measurement and analysis methods for RMA communication into the KOJAK toolkit. Special emphasis is put on the underlying event model used to represent the dynamic behavior of MPI-2 RMA operations. We show that our model reflects the relationships between communication and synchronization more accurately than existing models. In addition, the model is general enough to also cover alternative but simpler RMA interfaces, such as SHMEM and Co-Array Fortran.

Keywords

Message Passing Interface; Origin Process; Event Trace; Message Passing Interface Performance; Remote Memory Access
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Marc-André Hermanns (1)
  • Bernd Mohr (1)
  • Felix Wolf (2)

  1. Forschungszentrum Jülich, Zentralinstitut für Angewandte Mathematik, Jülich, Germany
  2. ICL, University of Tennessee, Knoxville, USA
