
Mitigating MPI Message Matching Misery

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9697)

Abstract

To satisfy MPI ordering semantics in the presence of wildcards, current implementations store posted receive operations and unexpected messages in linked lists. As applications scale up, communication patterns that scale with the number of processes or the number of threads per process can cause those linked lists to grow and become a performance problem. We propose new structures and matching algorithms to address these performance challenges. Our scheme utilizes a hash map that is extended with message ordering annotations to significantly reduce time spent searching for matches in the posted receive and the unexpected message structures. At the same time, we maintain the required MPI ordering semantics, even in the presence of wildcards. We evaluate our approach on several benchmarks and demonstrate a significant reduction in the number of unsuccessful match attempts in the MPI message processing engine, while at the same time incurring low space and time overheads.
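To make the idea concrete, below is a minimal C sketch of the posted-receive side of such a scheme. It is not the authors' implementation: the structure names, the fixed bucket count, and the single global sequence counter are assumptions made for illustration. Exact (source, tag) receives are placed in hash buckets, wildcard receives go to a separate list, and every entry carries a sequence number recording post order. On arrival, a message is compared against the oldest exact candidate and the oldest compatible wildcard candidate, and whichever was posted first wins, which preserves MPI ordering semantics even in the presence of wildcards.

    /* Hypothetical sketch of hash-based posted-receive matching with
     * ordering annotations. Names, sizes, and layout are illustrative,
     * not taken from the paper. */
    #include <stdint.h>
    #include <stdlib.h>

    #define NUM_BUCKETS 256
    #define ANY_SOURCE  (-1)
    #define ANY_TAG     (-1)

    typedef struct recv_entry {
        int source, tag;
        uint64_t seq;                  /* ordering annotation: post order */
        void *buffer;
        struct recv_entry *next;
    } recv_entry;

    static recv_entry *buckets[NUM_BUCKETS]; /* exact (source, tag) receives */
    static recv_entry *wildcards;            /* receives using any wildcard  */
    static uint64_t next_seq;                /* monotonically increasing     */

    static unsigned hash(int source, int tag) {
        return ((unsigned)source * 31u + (unsigned)tag) % NUM_BUCKETS;
    }

    /* Post a receive: wildcard receives go to a separate list so the hash
     * lookup stays exact; every entry records the order in which it was
     * posted. */
    void post_recv(int source, int tag, void *buffer) {
        recv_entry *e = malloc(sizeof *e);
        e->source = source; e->tag = tag; e->buffer = buffer;
        e->seq = next_seq++;
        recv_entry **head = (source == ANY_SOURCE || tag == ANY_TAG)
                            ? &wildcards : &buckets[hash(source, tag)];
        while (*head) head = &(*head)->next;  /* append to keep FIFO order */
        e->next = NULL;
        *head = e;
    }

    /* Match an incoming message: find the oldest exact-bucket candidate and
     * the oldest matching wildcard candidate, then take whichever was posted
     * first. Returns NULL if no receive matches (unexpected message path). */
    recv_entry *match_recv(int source, int tag) {
        recv_entry **exact = &buckets[hash(source, tag)];
        while (*exact && ((*exact)->source != source || (*exact)->tag != tag))
            exact = &(*exact)->next;

        recv_entry **wild = &wildcards;
        while (*wild &&
               !(((*wild)->source == ANY_SOURCE || (*wild)->source == source) &&
                 ((*wild)->tag    == ANY_TAG    || (*wild)->tag    == tag)))
            wild = &(*wild)->next;

        recv_entry **winner = NULL;
        if (*exact && (!*wild || (*exact)->seq < (*wild)->seq)) winner = exact;
        else if (*wild)                                         winner = wild;
        if (!winner) return NULL;

        recv_entry *e = *winner;
        *winner = e->next;              /* unlink the matched entry */
        return e;
    }

The unexpected-message structure can be organized with a symmetric layout, hashing arrived messages by their source and tag; that side is omitted here to keep the sketch short.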



Author information


Correspondence to Mario Flajslik or James Dinan.



Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Flajslik, M., Dinan, J., Underwood, K.D. (2016). Mitigating MPI Message Matching Misery. In: Kunkel, J., Balaji, P., Dongarra, J. (eds) High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, vol. 9697. Springer, Cham. https://doi.org/10.1007/978-3-319-41321-1_15


  • DOI: https://doi.org/10.1007/978-3-319-41321-1_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41320-4

  • Online ISBN: 978-3-319-41321-1

  • eBook Packages: Computer Science, Computer Science (R0)
