Benchmarking MPI is a contentious subject at best. Micro-benchmarks are used because they are easy to port and, hypothetically, measure an important system characteristic in isolation. The unfortunate reality is that it is remarkably difficult to create a benchmark that constitutes a fair measurement in the context of a modern system. Software optimizations and modern processor architectures perform extremely efficiently on benchmark codes in ways they would not in an application context. This paper explores the challenges faced when benchmarking the network in a modern microprocessor climate and the remarkable impact these effects have on the results obtained.
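The point that benchmarks run artificially "hot" can be sketched with a generic, hypothetical illustration (not code from the paper): a copy loop that reuses a single buffer keeps it cache-resident, as a typical ping-pong micro-benchmark does, while cycling through a working set larger than the cache mimics an application's colder memory behavior.

```python
import time

def time_copies(buf_count, size, iters=200):
    """Average time per copy when cycling through buf_count source buffers."""
    bufs = [bytearray(size) for _ in range(buf_count)]
    dst = bytearray(size)
    start = time.perf_counter()
    for i in range(iters):
        # buf_count=1 reuses one cache-hot buffer every iteration;
        # a large buf_count forces data back out of the cache.
        src = bufs[i % buf_count]
        dst[:] = src
    return (time.perf_counter() - start) / iters

# "Micro-benchmark" style: one 1 MB buffer, stays warm in cache.
warm = time_copies(buf_count=1, size=1 << 20)
# "Application" style: 64 MB working set, exceeds most last-level caches.
cold = time_copies(buf_count=64, size=1 << 20)
print(f"warm: {warm * 1e6:.1f} us/copy, cold: {cold * 1e6:.1f} us/copy")
```

On typical hardware the warm loop reports a noticeably lower per-copy time even though both perform identical work, which is exactly the kind of distortion the paper examines for network benchmarks.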


Keywords: Software Optimization, Compiler Optimization, System-Level Optimization, Message Latency, Effective Latency





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Keith D. Underwood, Sandia National Laboratories, Albuquerque
