Reliable Orchestration of Distributed MPI-Applications in a UNICORE-Based Grid with MetaMPICH and MetaScheduling

  • Boris Bierbaum
  • Carsten Clauss
  • Thomas Eickermann
  • Lidia Kirtchakova
  • Arnold Krechel
  • Stephan Springstubbe
  • Oliver Wäldrich
  • Wolfgang Ziegler
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4192)

Abstract

Large MPI applications whose resource demands exceed the capacity of the local site's cluster could instead be distributed across a number of clusters in a Grid to satisfy the demand. However, several drawbacks limit the applicability of this approach: communication paths between compute nodes of different clusters usually provide lower bandwidth and higher latency than cluster-internal ones, MPI libraries use dedicated I/O nodes for inter-cluster communication which become a bottleneck, and tools for co-ordinating the availability of the different clusters across different administrative domains are missing. To make the Grid approach efficient, several prerequisites must be in place: an MPI implementation providing high-performance communication mechanisms across the borders of clusters, a network connection with high bandwidth and low latency dedicated to the application, compute nodes made available to the application exclusively, and finally a Grid middleware gluing everything together. In this paper we present work recently completed in the VIOLA project: MetaMPICH, user-controlled QoS of clusters and the interconnecting network, a MetaScheduling Service, and the UNICORE integration.
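
One of the drawbacks listed above, that communication paths between compute nodes of different clusters provide much higher latency than cluster-internal ones, can be made concrete with a standard MPI micro-benchmark. The sketch below is not taken from the paper; it only assumes that MetaMPICH, as an MPI implementation, exposes the standard MPI C interface, and that the job description places the first and the last rank on different clusters (a site-dependent assumption). Comparing the measured round-trip time with the intra-cluster value illustrates why a dedicated high-bandwidth, low-latency interconnect is among the prerequisites named here.

    /* Minimal ping-pong sketch: average round-trip time between the first
     * and the last MPI rank. Plain MPI-1 C code; whether these two ranks
     * really reside on different clusters depends on the MetaMPICH job
     * configuration (assumed here, not guaranteed by this program). */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    #define ROUNDS   1000
    #define MSG_SIZE 1024

    int main(int argc, char **argv)
    {
        int rank, size;
        char buf[MSG_SIZE];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        memset(buf, 0, sizeof(buf));

        if (size < 2) {            /* need at least two ranks to ping-pong */
            MPI_Finalize();
            return 0;
        }

        int peer = (rank == 0) ? size - 1 : 0;

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();

        if (rank == 0) {
            for (int i = 0; i < ROUNDS; i++) {
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            }
            double rtt = (MPI_Wtime() - t0) / ROUNDS;
            printf("avg round-trip time rank 0 <-> rank %d: %.1f us\n",
                   peer, rtt * 1e6);
        } else if (rank == size - 1) {
            for (int i = 0; i < ROUNDS; i++) {
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
            }
        }

        MPI_Finalize();
        return 0;
    }

Ranks other than the first and the last only take part in the barrier; they are included so the measurement can be embedded in a realistic multi-cluster job layout rather than a two-process test run.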

Keywords

MetaMPICH · Grid · Co-allocation · UNICORE · Network QoS

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Boris Bierbaum (1)
  • Carsten Clauss (1)
  • Thomas Eickermann (2)
  • Lidia Kirtchakova (2)
  • Arnold Krechel (3)
  • Stephan Springstubbe (3)
  • Oliver Wäldrich (3)
  • Wolfgang Ziegler (3)

  1. Chair for Operating Systems, RWTH Aachen University, Aachen, Germany
  2. Central Institute for Applied Mathematics, Research Centre Jülich, Jülich, Germany
  3. Fraunhofer Institute SCAI, Sankt Augustin, Germany
