Recent Advances in Parallel Virtual Machine and Message Passing Interface

Volume 5759 of the series Lecture Notes in Computer Science pp 20-30

MPI on a Million Processors

  • Pavan Balaji (Argonne National Laboratory)
  • Darius Buntinas (Argonne National Laboratory)
  • David Goodell (Argonne National Laboratory)
  • William Gropp (University of Illinois)
  • Sameer Kumar (IBM T.J. Watson Research Center, Yorktown Heights)
  • Ewing Lusk (Argonne National Laboratory)
  • Rajeev Thakur (Argonne National Laboratory)
  • Jesper Larsson Träff (NEC Laboratories Europe)


Petascale machines with close to a million processors will soon be available. Although MPI is the dominant programming model today, some researchers and users wonder (and perhaps even doubt) whether MPI will scale to such large processor counts. In this paper, we examine the issue of how scalable MPI is. We first examine the MPI specification itself, discuss areas with scalability concerns, and describe how they can be overcome. We then investigate issues that an MPI implementation must address in order to be scalable. We ran experiments to measure MPI memory consumption at scale on up to 131,072 processes, or 80% of the IBM Blue Gene/P system at Argonne National Laboratory. Based on the results, we tuned the MPI implementation to reduce its memory footprint. We also discuss issues in scaling application algorithms to large process counts and describe features of MPI that enable the use of other techniques to overcome scalability limitations in applications.