Abstract
Many parallel applications need to communicate non-contiguous data. Most applications manually copy (pack/unpack) data before communication, even though MPI allows a zero-copy specification. In this work, we study two complex use cases: (1) a Fast Fourier Transform, where we express a local memory transpose as part of the datatype, and (2) a conjugate gradient solver with a checkerboard layout that requires multiple nested datatypes. We demonstrate significant speedups of up to a factor of 3.8 and up to 18%, respectively. Application developers can use our work as a template for utilizing datatypes. For MPI implementers, we show two practically relevant access patterns that deserve special optimization.
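To make the idea of folding a transpose into a datatype more concrete, the following is a minimal sketch in plain Python. It uses no MPI library and is not the authors' implementation; it only models how a strided receive layout, similar in spirit to what MPI_Type_vector combined with MPI_Type_create_resized can describe, scatters a contiguous incoming stream directly into transposed order. The helper names `vector_typemap` and `unpack_with_type` are hypothetical and exist only for this illustration.

```python
def vector_typemap(count, blocklength, stride):
    """Element offsets selected by a vector-like layout:
    `count` blocks of `blocklength` elements, block starts `stride` apart."""
    return [b * stride + i for b in range(count) for i in range(blocklength)]


def unpack_with_type(stream, typemap, extent, ntypes, size):
    """Scatter a contiguous stream into a buffer of `size` elements.
    The k-th instance of the layout starts at offset k * extent,
    which models a datatype whose extent was resized to `extent`."""
    buf = [None] * size
    it = iter(stream)
    for k in range(ntypes):
        for off in typemap:
            buf[k * extent + off] = next(it)
    return buf


rows, cols = 3, 4

# A rows x cols matrix in row-major order, arriving as a contiguous stream.
stream = list(range(rows * cols))

# Each incoming row (cols elements) is written into one column of the
# cols x rows receive buffer: cols single elements, `rows` apart.
column = vector_typemap(count=cols, blocklength=1, stride=rows)

# An extent of 1 element makes consecutive columns start next to each other.
recv = unpack_with_type(stream, column, extent=1, ntypes=rows, size=rows * cols)

# Print the receive buffer as a cols x rows matrix (the transpose).
for c in range(cols):
    print(recv[c * rows:(c + 1) * rows])
```

In this model the receive side never performs a separate transpose pass: the layout description alone decides where each incoming element lands, which is the property that lets an MPI implementation avoid an explicit pack/unpack copy.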
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hoefler, T., Gottlieb, S. (2010). Parallel Zero-Copy Algorithms for Fast Fourier Transform and Conjugate Gradient Using MPI Datatypes. In: Keller, R., Gabriel, E., Resch, M., Dongarra, J. (eds) Recent Advances in the Message Passing Interface. EuroMPI 2010. Lecture Notes in Computer Science, vol 6305. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15646-5_14
DOI: https://doi.org/10.1007/978-3-642-15646-5_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15645-8
Online ISBN: 978-3-642-15646-5
eBook Packages: Computer Science, Computer Science (R0)