Abstract
Large message latencies often lead to poor performance of parallel applications. In this paper, we investigate a latency-tolerating technique that immediately releases all blocking receives, even when the message has not yet (completely) arrived, and enforces execution correctness through page protection. This approach eliminates false dependencies on incoming message data and allows the computation to proceed as early as possible. We implement and evaluate our early-release technique in the context of an MPI runtime library. The results show that the execution speed of MPI applications improves by up to 60% when early release is enabled. Our approach also makes parallel programming faster and easier, as it frees programmers from adopting more complex nonblocking receives and from tuning message sizes to explicitly reduce false message data dependencies.
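To make the mechanism in the abstract concrete, the following C sketch shows how a blocking receive can be released early while page protection preserves correctness. It is a simplified illustration under stated assumptions, not the authors' implementation: the names early_release_recv, page_arrived, and on_fault are hypothetical, the buffer is assumed page-aligned, the handler spins where a real runtime would block, and the runtime is assumed to write arriving data through a separate mapping before unprotecting each page.

```c
/*
 * Minimal sketch of the early-release idea, not the paper's actual code:
 * a blocking receive read/write-protects its buffer and returns at once;
 * the first touch of a page whose data has not yet arrived faults, and
 * the handler stalls until the runtime has filled that page.  Assumes a
 * POSIX system, a page-aligned buffer, and single-threaded application
 * code; early_release_recv, page_arrived, and on_fault are illustrative
 * names, not part of MPI or of the authors' runtime.
 */
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t *page_arrived; /* set by the runtime's progress
                                               engine once a page is filled */
static uintptr_t buf_start;                 /* receive buffer base address  */
static long      page_size;

/* Fault handler: the application touched message data that is still in
 * flight; wait for it, unprotect the page, and let the access retry. */
static void on_fault(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    uintptr_t page = (uintptr_t)si->si_addr & ~(uintptr_t)(page_size - 1);
    size_t    idx  = (page - buf_start) / (size_t)page_size;
    while (!page_arrived[idx])
        ;  /* a real runtime would block or poll the network, not spin */
    mprotect((void *)page, (size_t)page_size, PROT_READ | PROT_WRITE);
}

/* "Blocking" receive that is released early: protect the buffer, install
 * the fault handler, and return to the caller immediately.  The runtime
 * is assumed to deposit arriving fragments (e.g., through a separate
 * mapping of the same pages) and to set page_arrived[i] per full page. */
void early_release_recv(void *buf, size_t len)
{
    page_size    = sysconf(_SC_PAGESIZE);
    buf_start    = (uintptr_t)buf;        /* assumed page-aligned */
    page_arrived = calloc((len + (size_t)page_size - 1) / (size_t)page_size,
                          sizeof *page_arrived);

    struct sigaction sa = {0};
    sa.sa_sigaction = on_fault;
    sa.sa_flags     = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    mprotect(buf, len, PROT_NONE);        /* any touch now traps */
    /* computation resumes here and overlaps with message arrival */
}
```

The essential design point is that the receive call no longer waits for the whole message: only the parts of the buffer the computation actually touches early can stall it, which removes the false dependency on data that arrives later.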
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Ke, J., Burtscher, M., Speight, E. (2005). Tolerating Message Latency Through the Early Release of Blocked Receives. In: Cunha, J.C., Medeiros, P.D. (eds.) Euro-Par 2005 Parallel Processing. Lecture Notes in Computer Science, vol. 3648. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11549468_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28700-1
Online ISBN: 978-3-540-31925-2