Verifying Collective MPI Calls

  • Jesper Larsson Träff
  • Joachim Worringen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3241)

Abstract

The collective communication operations of MPI, and in general MPI operations with non-local semantics, require the processes participating in the calls to provide consistent parameters, eg. a unique root process, matching type signatures and amounts for data to be exchanged, or same operator. Under normal use of MPI such exhaustive consistency checks are typically too expensive to perform and would compromise optimizations for high performance in the collective routines. However, confusing and hard-to-find errors (deadlocks, wrong results, or program crash) can happen by inconsistent calls to collective operations.

We suggest to use the MPI profiling interface to provide for more extensive semantic checking of calls to MPI routines with collective (non-local) semantics. With this, exhaustive semantic checks can be enabled during application development, and disabled for production runs. We discuss what can reasonably be checked by such an interface, and mention some inherent limitations of MPI to making a fully portable interface for semantic checking. The proposed collective semantics verification interface for the full MPI-2 standard has been implemented for the NEC proprietary MPI/SX as well as other NEC MPI implementations.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Gołebiewski, M., Ritzdorf, H., Träff, J.L., Zimmermann, F.: The MPI/SX implementation of MPI for NEC’s SX-6 and other NEC platforms. NEC Research & Development 44(1), 69–74 (2003)Google Scholar
  2. 2.
    Gropp, W.: Runtime checking of datatype signatures in MPI. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. In: Dongarra, J., Kacsuk, P., Podhorszki, N. (eds.) PVM/MPI 2000. LNCS, vol. 1908, pp. 160–167. Springer, Heidelberg (2000)CrossRefGoogle Scholar
  3. 3.
    Gropp, W., Huss-Lederman, S., Lumsdaine, A., Lusk, E., Nitzberg, B., Saphir, W., Snir, M.: MPI – The Complete Reference. The MPI Extensions, vol. 2. MIT Press, Cambridge (1998)Google Scholar
  4. 4.
    Krammer, B., Bidmon, K., Müller, M.S., Resch, M.M.: MARMOT: An MPI analysis and checking tool. In: Parallel Computing, ParCo (2003)Google Scholar
  5. 5.
    Luecke, G.R., Chen, H., Coyle, J., Hoekstra, J., Kraeva, M., Zou, Y.: MPICHECK: a tool for checking fortran 90 MPI programs. Concurrency and Computation: Practice and Experience 15(2), 93–100 (2003)MATHCrossRefGoogle Scholar
  6. 6.
    Snir, M., Otto, S., Huss-Lederman, S., Walker, D., Dongarra, J.: MPI – The Complete Reference, 2nd edn. The MPI Core, vol. 1. MIT Press, Cambridge (1998)Google Scholar
  7. 7.
    Träff, J.L.: SMP-aware message passing programming. In: Eigth International Workshop on High-level Parallel Programming Models and Supportive Environments (HIPS03), International Parallel and Distributed Processing Symposium (IPDPS 2003), pp. 56–65 (2003)Google Scholar
  8. 8.
    Träff, J.L., Ritzdorf, H., Hempel, R.: The implementation of MPI-2 one-sided communication for the NEC SX-5. In: Supercomputing (2000), http://www.sc2000.org/proceedings/techpapr/index.htm#01
  9. 9.
    Vetter, J.S., de Supinski, B.R.: Dynamic software testing of MPI applications with Umpire. In: Supercomputing, SC (2000), http://www.sc2000.org/proceedings/techpapr/index.htm#01
  10. 10.
    Worringen, J., Träff, J.L., Ritzdorf, H.: Fast parallel non-contiguous file access. In: Supercomputing (2003), http://www.sc-conference.org/sc2003/tech papers.phpGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Jesper Larsson Träff
    • 1
  • Joachim Worringen
    • 1
  1. 1.C&C Research LaboratoriesNEC Europe LtdSankt AugustinGermany

Personalised recommendations