Scalable Memory Use in MPI: A Case Study with MPICH2

  • David Goodell
  • William Gropp
  • Xin Zhao
  • Rajeev Thakur
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6960)


One of the factors that can limit the scalability of MPI to exascale is the amount of memory consumed by the MPI implementation. In fact, some researchers believe that existing MPI implementations, if used unchanged, will themselves consume a large fraction of the available system memory at exascale. To investigate and address this issue, we undertook a study of the memory consumed by the MPICH2 implementation of MPI, with a focus on identifying parts of the code where the memory consumed per process scales linearly with the total number of processes. We report on the findings of this study and discuss ways to avoid the linear growth in memory consumption. We also describe specific optimizations that we implemented in MPICH2 to avoid this linear growth and present experimental results demonstrating the memory savings achieved and the impact on performance.


Keywords: Memory consumption · Message size · Node leader · Memory saving · Virtual connection




Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • David Goodell (1)
  • William Gropp (2)
  • Xin Zhao (2)
  • Rajeev Thakur (1)
  1. Argonne National Laboratory, Argonne, USA
  2. University of Illinois, Urbana, USA
