Journal of Grid Computing, Volume 1, Issue 2, pp. 133–149

Towards Efficient Execution of MPI Applications on the Grid: Porting and Optimization Issues

  • Rainer Keller
  • Edgar Gabriel
  • Bettina Krammer
  • Matthias S. Müller
  • Michael M. Resch

Abstract

The message passing interface (MPI) is a standard used by many parallel scientific applications. It offers the advantage of a smoother migration path for porting applications from high performance computing systems to the Grid. In this paper, Grid-enabled tools and libraries for developing MPI applications are presented. The first is MARMOT, a tool that checks the adherence of an application to the MPI standard. The second is PACX-MPI, an implementation of the MPI standard optimized for Grid environments. Besides efficient development of the program, optimal execution is of paramount importance for most scientific applications. We therefore discuss not only performance at the level of the MPI library, but also several application-specific optimizations, such as latency hiding, prefetching, caching, and topology-aware algorithms, for example in a sparse parallel equation solver and an RNA folding code.

Keywords: computational Grids, optimizations for communication hierarchies, metacomputing, MPI, parallel debugging
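As a rough illustration of the latency-hiding technique named in the abstract (a generic sketch, not code from the paper; the ring neighbours, buffer size, and dummy workload are assumptions made for the example), the following C program posts a non-blocking halo exchange and overlaps it with local computation before waiting for completion:

    /* Generic latency-hiding sketch with non-blocking MPI (illustrative only). */
    #include <mpi.h>
    #include <stdio.h>

    #define N 1024

    int main(int argc, char **argv)
    {
        int rank, size;
        double send_halo[N], recv_halo[N], interior[N];
        MPI_Request reqs[2];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        for (int i = 0; i < N; i++) {       /* fill dummy data */
            send_halo[i] = rank;
            interior[i] = i;
        }

        int right = (rank + 1) % size;      /* assumed ring topology */
        int left  = (rank - 1 + size) % size;

        /* Post the (potentially high-latency) exchange first ... */
        MPI_Irecv(recv_halo, N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(send_halo, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

        /* ... and overlap it with computation on interior data. */
        double sum = 0.0;
        for (int i = 0; i < N; i++)
            sum += interior[i] * interior[i];

        /* Complete the exchange only when the halo data is actually needed. */
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

        printf("rank %d: interior sum %.1f, first halo value %.1f\n",
               rank, sum, recv_halo[0]);

        MPI_Finalize();
        return 0;
    }

Built and launched in the usual way (e.g., with mpicc and mpirun), each rank's communication over a slow wide-area link can proceed while the local loop runs, which is the general idea behind the latency hiding named in the abstract.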



Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Rainer Keller (1)
  • Edgar Gabriel (1)
  • Bettina Krammer (1)
  • Matthias S. Müller (1)
  • Michael M. Resch (1)

  1. High Performance Computing Center Stuttgart (HLRS), Stuttgart, Germany, and Innovative Computing Laboratory, Computer Science Department, University of Tennessee, Knoxville, USA
