
Wait-Free Message Passing Protocol for Non-coherent Shared Memory Architectures

  • Conference paper
Recent Advances in the Message Passing Interface (EuroMPI 2012)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 7490)


Abstract

The number of cores in future CPUs is expected to increase steadily. Balanced CPU designs scale hardware cache-coherence functionality with the number of cores in order to minimize bottlenecks in parallel applications. An alternative approach is to do away with hardware coherence entirely; the Single-chip Cloud Computer (SCC), a 48-core experimental processor from Intel Labs, does exactly that. A wait-free protocol for message passing over non-coherent buffers was introduced with the RCKMPI library in order to support MPI on the SCC. In this work, the message-passing performance of that protocol is modeled. Additionally, a port to symmetric multiprocessors is introduced and compared with MPICH2-Nemesis and Open MPI. Performance is analyzed based on statistics collected over a 4-dimensional space composed of source rank, target rank, message size and frequency.

The communication protocol presented here was developed in cooperation with Intel Labs Braunschweig for the RCKMPI library. RCKMPI is provided by Intel under an open-source license through the MARC community [1].

Support for this work was provided by the Transregional Collaborative Research Centre 89: Invasive Computing (InvasIC) [7].



References

  1. Intel's Many-core Applications Research Community, http://communities.intel.com/community/marc

  2. KNEM: High-Performance Intra-Node MPI Communication, http://runtime.bordeaux.inria.fr/knem/

  3. Leibniz-Rechenzentrum (LRZ): SuperMUC Petascale System, http://www.lrz.de/services/compute/supermuc/systemdescription/

  4. MPICH2, http://www.mcs.anl.gov/research/projects/mpich2/

  5. Ohio State University (OSU) Micro-Benchmarks, http://mvapich.cse.ohio-state.edu/benchmarks/

  6. Open MPI, http://www.open-mpi.org/

  7. Transregional Research Center InvasIC, http://www.invasic.de

  8. Buntinas, D., Goglin, B., Goodell, D., Mercier, G., Moreaud, S.: Cache-Efficient, Intranode, Large-Message MPI Communication with MPICH2-Nemesis. In: Parallel Processing, ICPP 2009 (2009)

  9. Chapman, K., Hussein, A., Hosking, A.L.: X10 on the Single-chip Cloud Computer: Porting and Preliminary Performance. In: Proceedings of the ACM SIGPLAN X10 Workshop (2011)

  10. Christgau, S., Kiertscher, S., Schnor, B.: The Benefit of Topology Awareness of MPI Applications on the SCC. In: 3rd Many-core Applications Research Community (MARC) Symposium (2011)

  11. Clauss, C., Lankes, S., Bemmerl, T.: Performance Tuning of SCC-MPICH by Means of the Proposed MPI-3.0 Tool Interface. In: Cotronis, Y., Danalis, A., Nikolopoulos, D.S., Dongarra, J. (eds.) EuroMPI 2011. LNCS, vol. 6960, pp. 318–320. Springer, Heidelberg (2011)

  12. Clauss, C., Lankes, S., Reble, P., Bemmerl, T.: Recent Advances and Future Prospects in iRCCE and SCC-MPICH. In: 3rd Many-core Applications Research Community (MARC) Symposium (2011)

  13. Comprés Ureña, I.A., Gerndt, M.: Improved RCKMPI's SCCMPB Channel: Scaling and Dynamic Processes Support. In: 4th Many-core Applications Research Community (MARC) Symposium (2011)

  14. Comprés Ureña, I.A., Riepen, M., Konow, M.: RCKMPI – Lightweight MPI Implementation for Intel's Single-chip Cloud Computer (SCC). In: Cotronis, Y., Danalis, A., Nikolopoulos, D.S., Dongarra, J. (eds.) EuroMPI 2011. LNCS, vol. 6960, pp. 208–217. Springer, Heidelberg (2011)

  15. Comprés Ureña, I.A., Riepen, M., Konow, M., Gerndt, M.: Invasive MPI on Intel's Single-chip Cloud Computer. In: Proceedings of the 25th International Conference on Architecture of Computing Systems (2012)

  16. Fuerlinger, K., Wright, N.J., Skinner, D.: Effective Performance Measurement at Petascale Using IPM. In: International Conference on Parallel and Distributed Systems, ICPADS (2010)

  17. Held, J.: Single-chip Cloud Computer, an IA Tera-scale Research Processor. In: Guarracino, M.R., Vivien, F., Träff, J.L., Cannataro, M., Danelutto, M., Hast, A., Perla, F., Knüpfer, A., Di Martino, B., Alexander, M. (eds.) Euro-Par 2010 Workshops. LNCS, vol. 6586, p. 85. Springer, Heidelberg (2011)

  18. Mattson, T.G., Riepen, M., Lehnig, T., Brett, P., Haas, W., Kennedy, P., Howard, J., Vangal, S., Borkar, N., Ruhl, G., Dighe, S.: The 48-core SCC Processor: the Programmer's View. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2010)

  19. Rotta, R.: On Efficient Message Passing on the Intel SCC. In: 3rd Many-core Applications Research Community (MARC) Symposium (2011)

  20. Wong, F.C., Martin, R.P., Arpaci-Dusseau, R.H., Culler, D.E.: Architectural Requirements and Scalability of the NAS Parallel Benchmarks. In: Proceedings of the Conference on Supercomputing (1999)




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Comprés Ureña, I.A., Gerndt, M., Trinitis, C. (2012). Wait-Free Message Passing Protocol for Non-coherent Shared Memory Architectures. In: Träff, J.L., Benkner, S., Dongarra, J.J. (eds) Recent Advances in the Message Passing Interface. EuroMPI 2012. Lecture Notes in Computer Science, vol 7490. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33518-1_19


  • Print ISBN: 978-3-642-33517-4

  • Online ISBN: 978-3-642-33518-1

