Implementing MPI on Windows: Comparison with Common Approaches on Unix

  • Jayesh Krishna
  • Pavan Balaji
  • Ewing Lusk
  • Rajeev Thakur
  • Fabian Tiller
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6305)


Commercial HPC applications are often run on clusters that use the Microsoft Windows operating system, and they need an MPI implementation that runs efficiently in the Windows environment. The MPI developer community, however, is more familiar with the issues involved in implementing MPI in a Unix environment. In this paper, we discuss some of the differences between implementing MPI on Windows and on Unix, particularly with respect to asynchronous progress, process management, shared-memory access, and threads. We describe how we implement MPICH2 on Windows and exploit the corresponding Windows-specific features while still keeping large parts of the code common with the Unix version. We also present results comparing the performance of MPICH2 on Unix and Windows on the same hardware. For zero-byte MPI messages, we measured excellent shared-memory latencies of 240 and 275 nanoseconds on Unix and Windows, respectively.


Keywords (added by machine, not by the authors): Unix System, Large Message, Small Message, Operating System Service, Present Performance Result





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Jayesh Krishna (1)
  • Pavan Balaji (1)
  • Ewing Lusk (1)
  • Rajeev Thakur (1)
  • Fabian Tiller (2)

  1. Argonne National Laboratory, Argonne
  2. Microsoft Corporation, Redmond
