Self-consistent MPI Performance Requirements

  • Jesper Larsson Träff
  • William Gropp
  • Rajeev Thakur
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4757)


The MPI Standard does not make any performance guarantees, but users expect (and like) MPI implementations to deliver good performance. A common-sense expectation of performance is that an MPI function should perform no worse than a combination of other MPI functions that can implement the same functionality. In this paper, we formulate some performance requirements and conditions that good MPI implementations can be expected to fulfill by relating aspects of the MPI standard to each other. Such a performance formulation could be used by benchmarks and tools, such as SKaMPI and Perfbase, to automatically verify whether a given MPI implementation fulfills basic performance requirements. We present examples where some of these requirements are not satisfied, demonstrating that there remains room for improvement in MPI implementations.
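Concretely, self-consistency requirements of this kind can be written as inequalities relating the running time of one MPI operation to that of a combination of MPI operations realizing the same effect. The relations below are illustrative instances of this pattern (representative examples in the spirit of the paper, not necessarily its exact list); read A(n) ⪯ B(n) as "operation A on data volume n should run no slower than B on the same volume":

```latex
\begin{align*}
\mathrm{MPI\_Gather}(n)    &\preceq \mathrm{MPI\_Gatherv}(n)
  && \text{(a special case should not be slower than the general case)}\\
\mathrm{MPI\_Allreduce}(n) &\preceq \mathrm{MPI\_Reduce}(n) + \mathrm{MPI\_Bcast}(n)
  && \text{(an operation should not be slower than a composite emulating it)}\\
\mathrm{MPI\_Bcast}(n)     &\preceq \mathrm{MPI\_Bcast}(n_1) + \mathrm{MPI\_Bcast}(n_2),\; n = n_1 + n_2
  && \text{(splitting the data should not pay off)}
\end{align*}
```

A benchmark such as SKaMPI can then time both sides of each relation on a given installation and flag any relation where the right-hand side is consistently faster.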


Keywords: Message Passing Interface; Collective Communication; Collective Operation; Parallel Virtual Machine; Random Communicator




References

  1. Augustin, W., Worsch, T.: Usefulness and usage of SKaMPI-bench. In: Dongarra, J.J., Laforenza, D., Orlando, S. (eds.) Recent Advances in Parallel Virtual Machine and Message Passing Interface. LNCS, vol. 2840, pp. 63–70. Springer, Heidelberg (2003)
  2. Barnett, M., Gupta, S., Payne, D.G., Schuler, L., van de Geijn, R., Watts, J.: Building a high-performance collective communication library. In: Supercomputing 1994, pp. 107–116 (1994)
  3. Culler, D.E., Karp, R.M., Patterson, D., Sahay, A., Santos, E.E., Schauser, K.E., Subramonian, R., von Eicken, T.: LogP: A practical model of parallel computation. Communications of the ACM 39(11), 78–85 (1996)
  4. Gropp, W., Huss-Lederman, S., Lumsdaine, A., Lusk, E., Nitzberg, B., Saphir, W., Snir, M.: MPI – The Complete Reference, vol. 2: The MPI Extensions. MIT Press, Cambridge (1998)
  5. McInnes, L.C., Ray, J., Armstrong, R., Dahlgren, T.L., Malony, A., Norris, B., Shende, S., Kenny, J.P., Steensland, J.: Computational quality of service for scientific CCA applications: Composition, substitution, and reconfiguration. Technical Report ANL/MCS-P1326-0206, Argonne National Laboratory (February 2006)
  6. Norris, B., McInnes, L., Veljkovic, I.: Computational quality of service in parallel CFD. In: Proceedings of the 17th International Conference on Parallel Computational Fluid Dynamics, University of Maryland, College Park, MD, May 24–27 (to appear, 2006)
  7. Reussner, R., Sanders, P., Prechelt, L., Müller, M.: SKaMPI: A detailed, accurate MPI benchmark. In: Alexandrov, V.N., Dongarra, J.J. (eds.) Recent Advances in Parallel Virtual Machine and Message Passing Interface. LNCS, vol. 1497, pp. 52–59. Springer, Heidelberg (1998)
  8. Reussner, R., Sanders, P., Träff, J.L.: SKaMPI: A comprehensive benchmark for public benchmarking of MPI. Scientific Programming 10(1), 55–65 (2002)
  9. Snir, M., Otto, S., Huss-Lederman, S., Walker, D., Dongarra, J.: MPI – The Complete Reference, vol. 1: The MPI Core, 2nd edn. MIT Press, Cambridge (1998)
  10. Thakur, R., Gropp, W.D., Rabenseifner, R.: Improving the performance of collective operations in MPICH. International Journal of High Performance Computing Applications 19, 49–66 (2004)
  11. Träff, J.L.: Hierarchical gather/scatter algorithms with graceful degradation. In: International Parallel and Distributed Processing Symposium (IPDPS 2004), p. 80 (2004)
  12. Träff, J.L.: Efficient allgather for regular SMP-clusters. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds.) Recent Advances in Parallel Virtual Machine and Message Passing Interface. LNCS, vol. 4192, pp. 58–65. Springer, Heidelberg (2006)
  13. Valiant, L.G.: A bridging model for parallel computation. Communications of the ACM 33(8), 103–111 (1990)
  14. Worringen, J.: Experiment management and analysis with perfbase. In: IEEE International Conference on Cluster Computing (2005)
  15. Worringen, J.: Automated performance comparison. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds.) Recent Advances in Parallel Virtual Machine and Message Passing Interface. LNCS, vol. 4192, pp. 402–403. Springer, Heidelberg (2006)

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Jesper Larsson Träff (1)
  • William Gropp (2)
  • Rajeev Thakur (2)
  1. NEC Laboratories Europe, NEC Europe Ltd., Rathausallee 10, D-53757 Sankt Augustin, Germany
  2. Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
