
Improving MPI Communication Overlap with Collaborative Polling

  • Conference paper
Recent Advances in the Message Passing Interface (EuroMPI 2012)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 7490)

Abstract

As parallel applications grow in complexity, their demand for computational power continues to increase. Recent trends in High-Performance Computing (HPC) have shown that improvements in single-core performance will not be sufficient to face the challenges of an Exascale machine: we expect an enormous growth in the number of cores as well as a multiplication of the data volume exchanged across compute nodes. To scale applications up to Exascale, the communication layer has to minimize the time spent waiting for network messages. This paper presents a message-progression technique based on Collaborative Polling, which enables efficient, auto-adaptive overlap of communication phases with computation. This approach is novel in that it increases the application's overlap potential without introducing the overhead of a threaded message progression.





Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Didelot, S., Carribault, P., Pérache, M., Jalby, W. (2012). Improving MPI Communication Overlap with Collaborative Polling. In: Träff, J.L., Benkner, S., Dongarra, J.J. (eds) Recent Advances in the Message Passing Interface. EuroMPI 2012. Lecture Notes in Computer Science, vol 7490. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33518-1_9


  • DOI: https://doi.org/10.1007/978-3-642-33518-1_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33517-4

  • Online ISBN: 978-3-642-33518-1

  • eBook Packages: Computer Science (R0)
