Recent Advances in the Message Passing Interface

Volume 6305 of the series Lecture Notes in Computer Science, pp. 132-141

Parallel Zero-Copy Algorithms for Fast Fourier Transform and Conjugate Gradient Using MPI Datatypes

  • Torsten Hoefler, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
  • Steven Gottlieb, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

Abstract

Many parallel applications need to communicate non-contiguous data. Most applications manually copy (pack/unpack) data before communication, even though MPI allows a zero-copy specification. In this work, we study two complex use cases: (1) a Fast Fourier Transform, where we express a local memory transpose as part of the datatype, and (2) a conjugate gradient solver with a checkerboard layout that requires multiple nested datatypes. We demonstrate significant speedups of up to a factor of 3.8 for the FFT and up to 18% for the conjugate gradient solver. Application developers can use our work as a template for adopting datatypes. For MPI implementers, we show two practically relevant access patterns that deserve special optimization.
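
To make the pack/unpack step concrete, here is a minimal sketch in plain C (no MPI calls) of the manual copy that a strided derived datatype would make unnecessary. The helper names `pack_strided` and `unpack_strided` are hypothetical and are not taken from the chapter. The layout they walk (`count` blocks of `blocklen` elements whose starts lie `stride` elements apart) is the one that MPI's `MPI_Type_vector(count, blocklength, stride, oldtype, &newtype)` describes, so a zero-copy version would hand that datatype to the send and receive calls instead of copying through a temporary buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the chapter): gather `count` blocks of
 * `blocklen` doubles, whose starts are `stride` elements apart in `src`,
 * into the contiguous buffer `dst`. This is the manual "pack" step. */
void pack_strided(const double *src, size_t count, size_t blocklen,
                  size_t stride, double *dst)
{
    assert(stride >= blocklen); /* blocks must not overlap */
    for (size_t i = 0; i < count; i++)
        for (size_t j = 0; j < blocklen; j++)
            dst[i * blocklen + j] = src[i * stride + j];
}

/* Hypothetical helper: scatter a contiguous buffer `src` back into the
 * same strided layout in `dst`. This is the manual "unpack" step. */
void unpack_strided(const double *src, size_t count, size_t blocklen,
                    size_t stride, double *dst)
{
    assert(stride >= blocklen); /* blocks must not overlap */
    for (size_t i = 0; i < count; i++)
        for (size_t j = 0; j < blocklen; j++)
            dst[i * stride + j] = src[i * blocklen + j];
}
```

For example, column 1 of a row-major 3x4 matrix `m` is packed with `pack_strided(m + 1, 3, 1, 4, col)`. Both loops touch every communicated element once on each side, which is the extra memory traffic a datatype-based specification aims to avoid.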