Event-Based Measurement and Analysis of One-Sided Communication
To analyze the correctness and performance of a program, information about the dynamic behavior of all participating processes is needed. This behavior can be modeled as a stream of events, each carrying the attributes required for later analysis. Based on this idea, KOJAK, a trace-based toolkit for performance analysis, records and analyzes the activities of MPI-1 point-to-point and collective communication.
To support remote memory access (RMA) hardware in a portable way, MPI-2 introduced a standardized interface for remote memory access. However, the potential performance gains come at the expense of more complex semantics. From a programmer's point of view, an MPI-2 data transfer is completed only after a sequence of communication and associated synchronization calls.
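The completion rule described above can be illustrated with a minimal sketch. It is not KOJAK's event model: the `Event` record, its field names, and the `completion_times` helper are hypothetical, and the sketch assumes only fence-based (active target) synchronization, with each fence treated as a single instantaneous event. Under those assumptions, a put or get recorded by a process counts as complete only when that process reaches its next fence on the same window.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Event:
    """Hypothetical trace record; the field names are illustrative only."""
    time: float                   # timestamp in seconds
    process: int                  # rank that recorded the event
    kind: str                     # "put", "get", or "fence"
    window: int                   # identifier of the RMA window
    target: Optional[int] = None  # target rank for put/get events


def completion_times(trace: List[Event]) -> Dict[int, float]:
    """Map the index of each put/get event to the time of the fence
    that completes it. Transfers with no later fence stay unmapped."""
    pending: Dict[Tuple[int, int], List[int]] = {}  # (process, window) -> open transfers
    done: Dict[int, float] = {}
    for i, ev in enumerate(trace):
        key = (ev.process, ev.window)
        if ev.kind in ("put", "get"):
            # The transfer has been issued but is not yet complete.
            pending.setdefault(key, []).append(i)
        elif ev.kind == "fence":
            # The fence closes every transfer issued on this window
            # by this process since the previous fence.
            for j in pending.pop(key, []):
                done[j] = ev.time
    return done


trace = [
    Event(0.0, 0, "fence", 1),
    Event(1.0, 0, "put", 1, target=1),
    Event(1.5, 0, "get", 1, target=1),
    Event(3.0, 0, "fence", 1),
    Event(4.0, 0, "put", 1, target=1),  # no closing fence in this trace
]
result = completion_times(trace)
print(result)  # {1: 3.0, 2: 3.0}
```

The last put has no entry in the result because no synchronization call follows it in the trace, which mirrors the point that a communication call alone does not complete a transfer.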
This paper describes the integration of performance measurement and analysis methods for RMA communication into the KOJAK toolkit. Special emphasis is put on the underlying event model used to represent the dynamic behavior of MPI-2 RMA operations. We show that our model reflects the relationships between communication and synchronization more accurately than existing models. In addition, the model is general enough to cover alternative but simpler RMA interfaces, such as SHMEM and Co-Array Fortran.
Keywords: Message Passing Interface; Origin Process; Event Trace; Message Passing Interface Performance; Remote Memory Access